
Winter Training, December 2011

Unix and Shell Programming


Department of COE and SE,
Delhi Technological University

Instructor: Divyashikha Sethia









Contents

UNIT 1: INTRODUCTION TO UNIX
UNIT 2: SHELL SCRIPTING
UNIT 3: ADVANCED SHELL SCRIPTING, SED, AND AWK



UNIT 1: INTRODUCTION TO UNIX

1. THE UNIX OPERATING SYSTEM - AN OVERVIEW
2. UNIX COMMANDS
3. UNIX FILE SYSTEM
4. THE VI TEXT EDITOR

COE
Unit 1, Lesson 1


LESSON 1: THE UNIX OPERATING SYSTEM - AN OVERVIEW

1. THE UNIX OPERATING SYSTEM - AN OVERVIEW
1.0 OBJECTIVES
1.1 INTRODUCTION
1.2 INTRODUCTION TO THE COMPUTERS
1.2.1 Typical hardware components of a computer
1.3 OPERATING SYSTEM
1.3.1 Virtual Memory
1.4 UNIX OPERATING SYSTEM
1.4.1 History of UNIX
1.4.2 Importance of UNIX
1.5 UNIX OPERATING SYSTEM ATTRIBUTES AND COMPONENTS
1.6 STARTING WITH UNIX
1.7 CHANGING YOUR PASSWORD
1.8 ENTERING COMMANDS IN THE UNIX SYSTEM
1.8.1 Command Options and Arguments
1.9 SUMMING UP
1.10 ANSWERS TO THE SELF CHECK QUESTIONS
1.11 TERMINAL QUESTIONS
1.12 REFERENCES


1. The UNIX Operating System - An Overview




The use and influence of computers have been steadily increasing over the last few
decades. Today, computers play a pivotal role in all walks of life. An operating
system (OS) is a core component of a computer system. An operating system lets
a computer function as a multi-user, multitasking and multithreading environment, thus
augmenting the power of the computer. UNIX is an operating system that offers its
users all these capabilities along with numerous other features. In this lesson we will
look at the features and components of the UNIX system that make it so useful
and popular. In the subsequent lessons we will explore these features and components
in more detail.



1.0 Objectives

After going through this lesson, you will be able to

Understand the concepts of an operating system
Understand what the UNIX operating system is
Understand the importance and popularity of the UNIX operating system
Understand how to start working on a UNIX machine


1.1 Introduction

In the modern age, we have seen computers doing wonders. From children
playing games to scientists launching satellites, computers clearly
play an important role. It is the operating system that has
made computing in the modern world possible and efficient.


1.2 Introduction to the computers

Unlike a calculator, a computer carries out user-specified tasks. An inherent
strength of a computer is that it can be programmed to do a variety of
tasks. Most computers are general purpose in the sense that the same
computer can be used to play a game and to perform a circuit simulation.

A computer consists of hardware and software. A computer can be defined as
a programmable machine that responds to and executes a list of instructions.
Such lists of instructions are called programs. The hardware components are
the physical components; software consists of data and instructions.

1.2.1 Typical hardware components of a computer

Hardware components are the parts of a computer that you can see and touch.

Memory: Enables the computer to store temporary data and instructions.
It is used by the computer during the execution of instruction sets.
For example, while evaluating the following expression, the
intermediate results are stored in memory:
Sum = 2 + 1 + 3 * 4

Mass storage devices: These are used for the bulk storage of data, such as
disk drives and tape drives.

Input devices: The interface that takes instructions from the user to the computer.
Commonly used input devices are the keyboard, mouse, web camera, etc.

Output devices: Display the results of the instruction processing done by the
computer. Commonly used output devices are display monitors and printers.

Central Processing Unit (CPU): The brain of the computer, in which all the
processing is done. It reads data from memory or input and executes the
instructions. The CPU consists of the ALU (Arithmetic Logic Unit) and the CU (Control
Unit). The ALU is responsible for all calculations and the CU is responsible for fetching
instructions and data for execution.

Working with the hardware components alone is very difficult because their
controls are very cryptic. Instead, software components are used to drive the
hardware components. The operating system is one such piece of software.


1.3 Operating System

An Operating System (OS) is an important program that runs on the
computer. An operating system performs the very basic tasks, such as
recognizing input from the user, sending output to the display, keeping track
of files and directories on the disk, and controlling peripheral devices such
as disk drives and printers.



The OS also works as a traffic cop: it makes sure that different programs and
users running at the same time do not interfere with each other. The operating
system is also responsible for security, blocking unauthorized users.

Operating systems can be classified as follows:

Multi-user: Allows multiple users to use the computer at the same time.
Multiprocessing: Supports running parts of a program in parallel on more than one CPU.
Multitasking: Allows multiple programs to run concurrently on a single CPU.
Multithreading: Allows different parts of a single program to run concurrently.
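
As a rough illustration of multitasking from the user's side, the sketch below starts two jobs from one shell and lets them run at the same time. It assumes a POSIX shell with the mktemp utility; the & operator used here is treated in the lesson on UNIX commands, and the file names a.txt and b.txt are made up for the illustration.

```shell
# Scratch directory for the two output files.
out=$(mktemp -d)

# "&" starts each job in the background, so both run concurrently
# instead of one after the other.
( sleep 1; echo "job A done" > "$out/a.txt" ) &
( sleep 1; echo "job B done" > "$out/b.txt" ) &

# wait blocks until all background jobs of this shell have finished.
wait

cat "$out/a.txt" "$out/b.txt"

# Count the files produced, then clean up.
finished=$(ls "$out" | wc -l | tr -d ' ')
rm -r "$out"
```

Both jobs sleep for one second, yet the whole script takes roughly one second rather than two, because the operating system runs them concurrently.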

Operating systems provide a platform on which other programs, called
application programs, can run. Application programs must be written to
run on a particular operating system. Your choice of operating system,
therefore, determines to a great extent the applications you can run. For PCs,
the popular operating systems are DOS, OS/2, Windows and Linux.

1.3.1 Virtual Memory

Programs that run on a computer may need more memory than is
physically available on that computer. Many operating systems give
the user the illusion of a much larger memory. This is done by loading only part
of the program and its data into physical memory: only the parts needed for
current execution are brought in. So, bigger programs can
be run even if physical memory is small.




Self-Check Questions

1. A ____________ is a prerecorded set of instructions, which is executed by the
computer to perform some task.
2. A computer is a specific purpose machine that cannot be tweaked to perform
some other tasks. (True/False)
3. The operating system keeps the temperature inside the computer down, so that
the functioning is proper. (True/False)
4. A ___________ system allows running parts of a program in parallel, on more
than one CPU.
5. In a _______________ system, a large number of users can use the system
concurrently.
6. The ____________ memory is an imaginary memory which is used by the
Operating System to get a larger address space.



1.4 UNIX Operating System

1.4.1 History of UNIX

The UNIX operating system found its beginnings in MULTICS, which stands
for Multiplexed Information and Computing Service. The MULTICS project
began in the mid 1960s as a joint effort by General Electric, the Massachusetts
Institute of Technology and Bell Laboratories. In 1969 Bell Laboratories
pulled out of the project.

One of the Bell Laboratories people involved in the project was Ken Thompson.
He liked the potential MULTICS had, but felt it was too complex and that the
same thing could be done in a simpler way. In 1969 he wrote the first version of
UNIX, called UNICS. UNICS stood for Uniplexed Information and Computing
Service. Although the operating system has changed, the name stuck and
was eventually shortened to UNIX.

Ken Thompson teamed up with Dennis Ritchie, who wrote the first C compiler.
In 1973 they rewrote the UNIX core (called the kernel) in C. The following year a
version of UNIX known as the Fifth Edition was first licensed to universities.
The Seventh Edition, released in 1979, served as a dividing point for two
divergent lines of UNIX development. These two branches are known as
SVR4 (System V Release 4) and BSD.

Ken Thompson spent a year's sabbatical at the University of California at
Berkeley. While he was there, two graduate students, Bill Joy and Chuck Haley,
wrote the first Berkeley version of UNIX, which was distributed to students.
This resulted in the source code being worked on and developed by many
different people. The Berkeley version of UNIX is known as BSD, Berkeley
Software Distribution. From BSD came the vi editor, the C shell, virtual memory,
Sendmail, and support for TCP/IP.

1.4.2 Importance of UNIX

Over the past 25 years the UNIX OS has evolved into a powerful, flexible,
versatile and robust operating system. It serves as the operating system for
a variety of computers: single-user personal computers, engineering
workstations, multi-user microcomputers, minicomputers, mainframes and
supercomputers, as well as special application devices. There are
approximately 20 million machines now running UNIX and more than 100
million users, and this popularity is expected to grow
further. The success of UNIX is due to many factors, including its
portability to a wide range of machines, its adaptability and simplicity, the wide
range of tasks it can perform, its multi-user and multitasking nature, and its
suitability for networking. What follows is a description of the features that
have made the UNIX system so popular.

Multi-user and Multitasking abilities
The UNIX OS allows a single computer to be used by many users. It is also a
multitasking system, that is, it allows more than one application to run on the
same computer at the same time.

Powerful command set
The UNIX OS provides a consistent and powerful set of commands that has
made it very useful particularly for the technical people.

Combining commands
UNIX provides constructs such as pipes and redirection, which
enable users to build their own powerful utilities from existing UNIX commands.
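
A minimal sketch of both constructs follows. It assumes a POSIX shell with the standard sort, uniq, wc, head, tr and mktemp utilities; the file name fruits.txt and its contents are made up for the illustration.

```shell
# Work in a scratch directory so nothing outside it is touched.
workdir=$(mktemp -d)

# Output redirection (>): send the output of printf to a file
# instead of the screen.
printf 'pear\napple\nmango\napple\n' > "$workdir/fruits.txt"

# Pipe (|): feed the output of one command into the next.
# sort orders the lines, uniq drops adjacent duplicates, wc -l counts lines;
# tr removes the padding some wc implementations add.
distinct=$(sort "$workdir/fruits.txt" | uniq | wc -l | tr -d ' ')

# Input redirection (<): make sort read its standard input from the file.
first=$(sort < "$workdir/fruits.txt" | head -n 1)

echo "distinct fruits: $distinct"
echo "first in sorted order: $first"

rm -r "$workdir"
```

None of the individual commands knows how to count distinct lines; the small "utility" exists only because three simple commands were joined with pipes.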

Excellent environment for Networking
UNIX offers programs and utilities that provide the services needed to build
networked applications - the basis for distributed, networked computing. With
networked computing, information and processing are shared among different
computers in a network. This is useful in client-server computing, where the
machines on the network can be clients and servers at the same time. The UNIX
system was used as the base system for the development of Internet
services and the growth of the Internet.

Portability
The UNIX system is far easier to port to new machines than other
operating systems. Its portability to almost any computer results
from its being written almost entirely in the C programming language.


1.5 UNIX Operating System Attributes and Components

The UNIX operating system is made up of several major components. These
include the commands and user programs, the file system, the shell, and the
kernel.

The Commands and User Programs

UNIX provides a number of built-in commands; in addition, user programs
can also be run.

The File System

The basic unit that stores information in the UNIX system is called a file. The
UNIX file system provides a logical method of organizing files. Files are
organized in a hierarchical file system, where files are grouped together in
directories.







An important simplifying feature of the UNIX system is the way it treats
files. For example, physical devices are treated as files. This permits the same
command to work on an ordinary file or a device, i.e. the same command can be
used to write to a file and to a printer.
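
The idea can be tried without a printer. The sketch below assumes a POSIX shell and uses /dev/null, a device file present on UNIX-like systems that discards whatever is written to it, as a stand-in for a real device.

```shell
# Scratch file for the "ordinary file" half of the comparison.
scratch=$(mktemp)

# The same command (echo with output redirection) writes to an ordinary file...
echo "hello" > "$scratch"

# ...and to a device file.
echo "hello" > /dev/null

# ls -l marks the file type in the first character of each line:
# "-" for an ordinary file, "c" for a character device.
ls -l "$scratch" /dev/null

file_type=$(ls -l "$scratch" | cut -c1)
dev_type=$(ls -l /dev/null | cut -c1)
echo "ordinary file: $file_type, device: $dev_type"

rm "$scratch"
```

The echo command itself is unaware of the difference; the kernel routes the write either to the disk or to the device driver.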

The Shell and shell scripts

The shell is the command interpreter in the UNIX operating system. It reads
the commands specified by the user and interprets them as requests to execute a
program or a set of programs, which it then arranges to carry out. The shell
also provides a programming language. Shell scripts are covered in
the later units of this course.
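
To give a first taste of the shell as a programming language, here is a small sketch, assuming a POSIX shell. The function name greet is hypothetical and chosen only for this illustration.

```shell
#!/bin/sh
# greet: hypothetical function for this illustration.
# It uses a variable, a conditional and a loop, the basic building
# blocks of the shell's programming language.
greet() {
    count=$1
    if [ "$count" -lt 1 ]; then
        echo "nothing to do"
        return 0
    fi
    i=1
    while [ "$i" -le "$count" ]; do
        echo "hello number $i"
        i=$((i + 1))
    done
}

# Calling the function looks exactly like running a command.
greet 3
```

Note that each line inside the function is either an ordinary command (echo) or a control construct that decides which commands run; this is what makes shell scripts a natural extension of interactive use.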

The kernel

The kernel is the core of the OS. The kernel interacts directly with the
hardware through a set of programs called device drivers, which are built into
the kernel. It provides a set of services that can be used by other
programs, and it shields these programs from the hardware layers. The major
functions of the kernel are to maintain the file system, manage memory,
control access to the computer, handle interrupts and signals (for example,
Ctrl+C sends a signal that normally terminates the running process), handle
errors, and handle I/O, which enables the computer to interact with
peripheral devices such as printers, monitors and storage devices.

Programs use the kernel through system calls. For example, if the user wants
a file to be opened, the program issues a system call, and the directories on
the path are opened first and then the file.

Example: Hierarchical File Structure
/dtu/COE_Course/COE_101/schedule

Here dtu is the parent directory, located in the root directory (/), and the
other directories are nested inside it.

The figure below shows the relationship among the components of the
UNIX operating system.

[Figure: Components of the UNIX operating system (shown in gray). Labeled
layers: the user commands, the shell, the kernel, and the hardware.]



Self-check Questions

7. UNIX is a multi-user OS and also possesses multitasking abilities. (True/False)
8. The first version of the UNIX Operating System was known as _____________.
9. The file system in a UNIX Operating System is a hierarchical structure.
(True/False)
10. The ____________ in a UNIX Operating System is used to interact with the
hardware and execute the user commands and programs.
11. The command interpreter in the UNIX system is called ___________.
12. The programs in the UNIX system interact with the kernel using
__________ calls to perform their tasks.




1.6 Starting with UNIX

This section explains how to log into a UNIX system and
how to change your password on a UNIX system. We will touch on the
different types of system configurations and how to log on to systems
having these configurations.

Selecting a login

Every UNIX user on a multi-user system is recognized by a login name, which
is the only identity he has on the system. It must be set up before you can
log onto a multi-user or a single-user UNIX system.

UNIX provides excellent built-in security. No users are permitted
unless they are identified. For this identification, each user has a login ID.
The login ID is typically allocated by an authority known as the system
administrator. The system administrator is also responsible for adding new users
to the system and providing them with a login name, an initial work environment
and a password on the computer.

UNIX initially shows a login prompt, where the user types in his login ID.
The password prompt follows. After you correctly type in the password, you
are logged into the system. The example below shows this process.

login: akash
password:

Here akash is the user's login name. Note that, to keep the password secure,
it is not displayed when you type it.

Connecting to the UNIX System

In a multi-user system, ask the system administrator how
you can connect to the system using your PC or terminal. Your PC can be
directly wired to the computer or connected via a LAN.

Direct Connect - A method of connecting to a UNIX machine when
there is a single machine.

Dial-in Access - You can dial in to the UNIX network using a modem and use
a terminal emulator to get the UNIX prompt.

Local Area Network (LAN) - A LAN follows the client-server model. Connect to the
server from a client workstation to use the UNIX capabilities.

IP Networks - Using an IP network such as the Internet, one can connect to
remote machines using the telnet capability of UNIX.


1.7 Changing Your Password

Your password is very important information that you must not share with
anyone. You should change it regularly (say, once every 2 months) and
remember it rather than write it on paper. Your password should contain 6
to 8 characters and should not simply be your name, your date of birth, etc. Your
password should also contain at least one non-alphabetic character (for example, a number).

To change the password of your login, use the passwd command.

bash> passwd
passwd: Changing password for sushobhit
Old password:
New password:
Re-enter new password:
bash>
There is a simple scheme to create complex passwords and still remember
them. Take the first letters of a line of your favorite poem or
song and add a number or symbol to make a complex password. Here is an
example: say you pick the line "Twinkle twinkle little star". Take the first
letters to make the string Ttls. Suppose your favorite symbol is = (equal
sign) and your favorite number is 2; append these to the string to make your
complex password, Ttls2=. For anyone else it will be too hard
to find out, while it is very easy for you to remember.
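
The scheme is mechanical enough to sketch in a few lines of shell, purely as an illustration of the steps. The helper name initials is hypothetical, and a real password should of course never be stored in or printed by a script.

```shell
# initials: hypothetical helper that prints the first letter of each word.
initials() {
    result=""
    for word in "$@"; do
        # cut -c1 keeps only the first character of the word.
        result="$result$(printf '%s' "$word" | cut -c1)"
    done
    printf '%s\n' "$result"
}

# The line from the rhyme, then the favorite number and symbol appended.
base=$(initials Twinkle twinkle little star)
echo "${base}2="
```

Run as written, this prints Ttls2=, the example password from the text.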

NOTE: If you forget your password, it cannot be retrieved even by the system
administrator. The only remedy in such cases is for the system administrator
to reset the password.



Self-Check Questions

13. ________________ is the program which is used to connect to the UNIX system
from a remote system.
14. ___________________ in a multi-user system is the person who is responsible
for maintaining the system.
15. Find the odd one out.
To connect to a UNIX system, one of the following methods can be used:
a. Dial-in access
b. IP Networks
c. LAN
d. System Calls
16. If you forget your password, the system administrator can give you permissions.
(True/False)




1.8 Entering Commands in the UNIX System

UNIX provides numerous commands. When the user types a command
at the UNIX prompt, the shell invokes the program for that command. The
command program can make many system calls, and these calls in turn interact
with the hardware through the kernel.
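
One way to see which program the shell would invoke for a command name is the command -v builtin, sketched below under the assumption of a POSIX shell. The variable names are made up for the illustration.

```shell
# command -v prints how the shell would resolve a command name:
# typically a path for an external program, the bare name for a shell
# builtin, or an alias definition if an alias is in effect.
ls_location=$(command -v ls)
cd_location=$(command -v cd)

echo "ls resolves to: $ls_location"
echo "cd resolves to: $cd_location"
```

On most systems ls is reported as a path to a program file, while cd is reported by name only, because changing the shell's own directory has to be done by the shell itself.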

1.8.1 Command Options and Arguments

The UNIX system has a standardized command syntax that applies to almost
all UNIX commands. Every command has some base functionality, plus
additional functionality that is selected through command line options and arguments.
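
The difference between an argument and an option can be sketched with the ls command, which lists the contents of a directory. The example assumes a POSIX shell with mktemp; the file names notes.txt and .hidden are made up for the illustration.

```shell
# Scratch directory with one ordinary file and one hidden file
# (a name that starts with a dot).
demo=$(mktemp -d)
touch "$demo/notes.txt" "$demo/.hidden"

# Command with an argument only: the argument says WHAT to list.
plain=$(ls "$demo")

# Command with an option and an argument: the option -a changes HOW ls
# behaves, making it include names that start with a dot.
all=$(ls -a "$demo")

echo "ls:    $plain"
# $all is left unquoted on purpose so the names print on one line.
echo "ls -a:" $all

rm -r "$demo"
```

Options conventionally start with a dash and come before the arguments, which is the pattern followed by almost all UNIX commands.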

For example, the ls command can be used to list the contents of a directory.

bash> ls
README 2134.tar.gz game_scores game_schedule

Now let's use the ls command with an option.

bash> ls -l
-rw-r--r-- 1 anmol friends 10777 Mar 30 16:26 README
-rw-r--r-- 1 achint friends 21483 Feb 28 17:39 2134.tar.gz
drwxr-xr-x 2 amit friends 4096 Dec 12 16:41 game_scores
drwx------ 3 arat friends 4096 May 10 2006 game_schedule

This example shows the -l option of the ls command, which produces
the long listing format.

Another frequently used command is man, which
displays the manual pages of other commands.


1.9 Summing Up

An operating system is the most important software in any computer, as it fills
the communication gap between a user and the underlying hardware. The UNIX
operating system, with its unique qualities and adaptability, is a popular and
powerful operating system nowadays. In the chapters to follow we will explore
the powers of UNIX in some detail.


1.10 Answers to the self check questions

1. program
2. False
3. False
4. multiprocessing
5. multi-user
6. virtual
7. True
8. UNICS
9. True
10. kernel
11. shell
12. system
13. telnet
14. system administrator
15. d
16. False


1.11 Terminal questions

1. List and briefly explain the components of the UNIX operating system.
2. Which features of the UNIX operating system account for its
popularity amongst users?
3. Explain briefly the possible ways to log onto a UNIX system.


1.12 References

1. http://www.uwsg.iu.edu/usail/concepts/unixhx.html

COE Unit 1, Lesson 2


LESSON 2: UNIX COMMANDS

2. UNIX COMMANDS
2.0 OBJECTIVES
2.1 INTRODUCTION
2.2 THE COMMANDS CLASS
2.3 CONNECTING TO UNIX
2.3.1 telnet command
2.3.2 rlogin command
2.4 FILE MANAGEMENT
2.4.1 mv command
2.4.2 cp command
2.4.3 rm command
2.5 A COMMUNICATION RELATED COMMAND - FTP
2.6 INFORMATION
2.6.1 man command
2.6.2 du - Disk usage
2.6.3 df - Disk free
2.6.4 quota
2.6.5 who - Finding out who is logged on
2.7 PRINTING
2.7.1 lpr - Printing
2.7.2 lprm - Removing a printing job
2.7.3 lpq - Checking the printing queue
2.8 PROCESS CONTROL
2.8.1 ps - Finding the process
2.8.2 & - Running process in background
2.8.3 Cntrl-z - Suspending a process
2.8.4 Jobs - Finding the process in background
2.8.5 Kill - Killing a process
2.8.6 nice - Reducing the priority of a process
2.9 MISCELLANEOUS COMMANDS
2.9.1 alias / unalias command
2.9.2 cal (calendar) command
2.9.3 clear command
2.9.4 crontab command
2.9.5 csh command
2.9.6 history command
2.9.7 date command
2.9.8 echo command
2.9.9 grep command
2.9.10 unset command
2.9.11 tar command
2.9.12 tee command
2.9.13 touch command
2.10 SUMMING UP
2.11 ANSWERS TO THE SELF-CHECK QUESTIONS
2.12 TERMINAL QUESTIONS


2. Unix Commands




UNIX, like any other operating system, provides a set of commands with
which its users can perform the tasks they want. UNIX offers its users a huge
variety of commands. In this lesson we will
learn the usage of many of the commands in UNIX.



2.0 Objectives

After going through this lesson, you will be able to

Use the UNIX commands to perform tasks
Understand how to send and receive mail on UNIX
Understand the basic file management commands
Understand the information and communication commands of UNIX


2.1 Introduction

UNIX provides a number of commands. For the ease of understanding we can
divide these commands into various categories.


2.2 The Commands class

UNIX commands can be grouped into a few broad classes:

Starting and Ending
These are the commands used to log on to the UNIX
system, or to start working on the UNIX system.

File Management
The file is the basic data holding entity in UNIX systems. There is a set of
commands that can be used to maintain the file system so as to keep the data
stored in the files secure, updated and maintained.

Communication
UNIX provides communication at many levels, including mail, writing
messages, exchanging files, etc.



Information
UNIX provides a number of commands to get information about the system,
such as who is logged in, how much disk space is available, etc.

Printing
In UNIX a user can give the print command and can also monitor the status of
the job or, if required, remove the job from the queue.

Job and Process control
As there are many processes running in a UNIX system, it is
sometimes necessary to get information about the user jobs running on
the system. For this purpose UNIX provides a set of commands to monitor,
kill, prioritize and resume jobs.

In the present chapter we will look at some of these commands in detail;
the other commands will be discussed in the chapters to follow.


2.3 Connecting to UNIX

Before we go into details, the very first thing we will look at is the
process a user has to follow to start working with the UNIX system.

2.3.1 telnet command

The telnet command is used for logging into a remote system. The telnet
command presents the same login and password prompts as a local
system.

2.3.2 rlogin command

The rlogin command is used to connect to a remote computer. It is
comparatively easier to use than telnet. Here is the syntax of the rlogin command:

rlogin [-l username] hostname

In this syntax, username defaults to the username of the current user.
hostname is the name of the UNIX machine to log on to.


2.4 File Management

A file is the basic data storage entity in a UNIX system. There is a set of
commands that can be used to maintain the file system. This chapter gives an
introductory flavor of these commands; the complete
discussion is taken up in the chapter on the file system. Readers are advised
to have a look at the man pages of each of these commands and try to
understand what exactly these commands are used for.
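
When experimenting with file management commands, it is safer to work in a scratch directory than among real files. The sketch below assumes a POSIX shell with mktemp and walks through the cp, mv and rm commands introduced in the following subsections; the file name backup.txt is made up for the illustration.

```shell
# All work happens in a scratch directory that is removed at the end.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

echo "draft" > tempPresentation.txt

cp tempPresentation.txt backup.txt             # copy: both files now exist
mv tempPresentation.txt finalPresentation.txt  # rename: the old name is gone
rm backup.txt                                  # remove: only the renamed file is left

remaining=$(ls)
echo "files left: $remaining"

# Leave the scratch directory before deleting it.
cd / && rm -r "$scratch"
```

Keep in mind that rm deletes without asking and UNIX has no recycle bin, which is one more reason to practice in a throwaway directory.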

2.4.1 mv command

The mv command moves a file. The command can also be used to rename a
file. Here is a simple example of the mv command.

bash> ls
tempPresentation.txt
bash> mv tempPresentation.txt finalPresentation.txt
bash> ls
finalPresentation.txt

2.4.2 cp command

The cp command copies a file. Here is a simple example of the cp command.

bash> ls
tempPresentation.txt
bash> cp tempPresentation.txt finalPresentation.txt
bash> ls
tempPresentation.txt finalPresentation.txt

2.4.3 rm command

The rm command removes a file. Here is an example of the rm command.

bash> ls
tempPresentation.txt finalPresentation.txt
bash> rm tempPresentation.txt
bash> ls
finalPresentation.txt


2.5 A communication related command - ftp

The ftp (file transfer protocol) command is used for copying files between
the local computer and a remote computer. While mv and cp work within a single
system, you may need to get files from another system;
ftp can be used for that.

In the example below we can see how ftp can be used to connect to a remote
machine. In this example user achint gets a file from the machine mitserv.

bash> ftp mitserv
Connected to mitserv
Name: achint                       # User types his login id
331 Please specify the password.
Password:                          # password will not be visible
230 Login successful.
Remote system type is UNIX.
ftp> get myPresentation.txt        # Now you are in ftp. See the prompt
250KB data transfer successful
ftp> quit
bash>                              # You are out of ftp now.

The ftp prompt provides a limited set of commands, as listed below:

bin    Changes the file transfer type to binary image transfer.
get    Gets a file from the remote machine.
mget   Gets multiple files from the remote machine.
ls     Lists the contents of a directory on the remote machine.
cd     Changes directories on the remote machine.
pwd    Shows the present working directory on the remote host.
lpwd   Shows the current working directory on the local host.


2.6 Information

Information about UNIX commands, other users, disk quota and other
things can be retrieved using some of the UNIX commands. In this section we
will discuss some of these commands.

2.6.1 man command

UNIX traditionally provides manual pages (called man pages) for all the
built-in commands and for system calls.

You can learn a lot by referring to the manual pages for commands.

The general syntax of the command is

man [-] [-k keywords] topic/command

The example below displays the manual page of the du command.

bash> man du

2.6.2 du - Disk usage

This command is used to find out how much disk space is currently occupied
by the files and directories of the user.

2.6.3 df - Disk free

The df command tells how much free disk space is left on the file system.

2.6.4 quota

This command shows how much disk space the user's files are occupying on
the file system, along with the limit (quota) placed on that usage.
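
The following is a minimal sketch of how du and df might be tried out, assuming a POSIX shell with mktemp, dd and awk available. The scratch directory and the file name sample.dat are hypothetical, and the numbers reported will differ from system to system.

```shell
#!/bin/sh
# Scratch directory so the example does not touch real data (hypothetical name).
workdir=$(mktemp -d)

# Create a file of roughly 100 KB to have something to measure.
dd if=/dev/zero of="$workdir/sample.dat" bs=1024 count=100 2>/dev/null

# du -sk: total space used by the directory, in kilobytes.
used_kb=$(du -sk "$workdir" | awk '{print $1}')
echo "Space used by $workdir: ${used_kb} KB"

# df -k: free space on the file system that holds the directory.
# -P requests the portable format with one line per file system.
free_kb=$(df -Pk "$workdir" | awk 'NR==2 {print $4}')
echo "Space still available: ${free_kb} KB"

rm -r "$workdir"
```

The -k option is used with both commands so that the two figures are reported in the same unit.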

2.6.5 who - Finding out who is logged on

The who command displays information such as the usernames, terminal IDs
and login times of the users logged on to the computer.

General syntax of the command is:
who [-q] [am i]

The following example shows the output of the who command.

bash> who
singhs   :0      May 28 14:05
achint   pts/0   May 28 14:06 (lx-ptiwari:0.0)
anmol    pts/1   May 28 14:12 (lx-ptiwari:0.0)


Self-Check Questions

1. Which of the commands below are used to connect to remote computers?
i. telnet
ii. rlogin
iii. rm
2. It is not possible to log on to another machine with another username by any
means. (True/False)
3. If some files need to be transferred from a remote location to the current
location, we can use the ________________ command for this purpose.
4. If a user needs to know the usage of the write command, he can use the
____________ command to know how the command works.
5. There is a restriction on the usage of disk space by a user or a group on a
UNIX system, and this disk space restriction can be found by using the command
_____________.
6. To know how much total disk space your files and directories have taken,
issue the __________ command.
7. On a multi-user system, more than one person may be logged on to a machine,
and this sometimes overloads the machine. To find out who is logged on to
the machine we can use the ______________ command.



2.7 Printing

UNIX provides commands for printing documents. Additionally, it is
possible to control the printer queue and to cancel a printing job if
required.

2.7.1 lpr - Printing

This command prints the text in a file. A printer can be specified on the
command line; otherwise the print job is sent to the default printer set by
the user.

2.7.2 lprm - Removing a printing job

The lprm command can be used to cancel print jobs that are queued or
printing. It can cancel jobs on a specified printer or on the default
printer.

2.7.3 lpq - Checking the printing queue

This command shows the printer queue status of the named printer. Jobs
queued on the default destination are shown if no printer or class is
specified on the command line.


2.8 Process Control

When you run a program in UNIX, a copy of the program starts to run. This
running copy is called a process. The concept of a process is fundamental
to the UNIX OS, so you should take the time to understand processes in
detail. If you run the same command twice, a new process is started each
time.

Every process is identified by a unique process ID. This ID can be used to
refer to the process or to perform further operations on it, such as
killing the process. We will now look at the commands which can be used
to control processes.
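
As a rough illustration of the point that each run of a command gets its own process ID, the sketch below starts the same command twice and prints the IDs. It assumes a POSIX shell; $$ and $! are standard shell variables, and the & suffix used here is described later in this section.

```shell
#!/bin/sh
# $$ expands to the process ID of the shell running this script.
echo "This shell is process $$"

# Run the same command twice in the background; $! holds the ID of the
# most recently started background process.
sleep 1 &
first_pid=$!
sleep 1 &
second_pid=$!

echo "First run:  process $first_pid"
echo "Second run: process $second_pid"

# Wait for both processes to finish before the script exits.
wait "$first_pid" "$second_pid"
```

The two IDs printed differ, even though the command line was identical both times.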

2.8.1 ps - Finding the process

This command is used to list all the processes being run on the machine.

bash> ps -ef
PID   PPID   User     Process
233   230    achint   ls -l
345   342    anmol    ps -ef
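
The PID and PPID columns can also be queried for a single process. The sketch below is one way to do this, assuming a ps that accepts the POSIX -p and -o options (some minimal implementations do not); the variable name parent_pid is only for illustration.

```shell
#!/bin/sh
# List only the current shell, showing its PID, parent PID and command name.
# -p selects a process by ID; -o chooses the columns to print.
ps -p $$ -o pid,ppid,comm

# A trailing "=" after a column name suppresses the header line,
# which is convenient when capturing a single value.
parent_pid=$(ps -p $$ -o ppid= | tr -d ' ')
echo "The parent of process $$ is process $parent_pid"
```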





2.8.2 & - Running process in background

Putting & at the end of any command runs that command in the background.
Time-consuming commands can be put into the background so that you can
continue working on the same terminal.

2.8.3 Ctrl-z - Suspending a process

If a command has been issued by mistake, or you want to do something else
first, you can press Ctrl-z to suspend the process and free the CPU for
other, more important work.

2.8.4 jobs - Finding the process in background

To find the processes running in the background you can use the jobs
command. This is different from the ps command: jobs lists only the jobs
started from the current shell.

2.8.5 kill - Killing a process

If a process has been running for a long time or is producing unwanted
results, you can use the kill command to kill the process.

The syntax of the command is
kill [-signal] process_id

If a process still does not terminate, you can send it the -9 signal to
force it to be killed.

2.8.6 nice - Reducing the priority of a process

This command runs a command at reduced priority, letting other commands
run ahead of it.

The syntax of the command is
nice command [command option]
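
The process-control commands of this section can be combined in a short script. What follows is only a sketch, assuming a POSIX shell: sleep 30 stands in for any long-running command, and job_pid and status are illustrative variable names.

```shell
#!/bin/sh
# Start a long-running command in the background at reduced priority.
nice sleep 30 &
job_pid=$!

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$job_pid" 2>/dev/null; then
    echo "Process $job_pid is running in the background"
fi

# Ask the process to terminate (the default signal is TERM).
kill "$job_pid"

# Collect the exit status; a process ended by a signal reports a value
# above 128.
status=0
wait "$job_pid" || status=$?
echo "Process $job_pid ended with status $status"
```

Sending the default signal first and reserving -9 for processes that ignore it gives the process a chance to clean up before it exits.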



Self-Check Questions

8. Once a print job is fired it is not possible to abort the printing. (True/False)
9. To find out which print jobs are waiting in the printer queue, we
can use the ____________ command.
10. To print some text in a file, use the ______________ command.
11. To change the priority of a job we can use the _________ command.
12. If a process has been started which is not required at the moment and we
need to start another process, we can suspend the first process using
_______________ and continue with it later on.
13. To find out which processes are running on the system we will issue the
______________ command.



2.9 Miscellaneous commands

Besides the commands that we have discussed in this lesson so far,
there are numerous other commands in UNIX, with lots of options, which can
be used to perform some amazing tasks. We will discuss some of these
commands along with their most useful and common options. For other options
readers can refer to the man pages of these commands.

2.9.1 alias / unalias command

These commands are used to create or remove an alias for a command.
The example shows their use.

bash> alias rm='rm -i'       Creates an alias rm which calls rm -i
bash> unalias rm             Now rm will call the rm command

2.9.2 cal (calendar) command

This command displays the calendar.

2.9.3 clear command

This command clears the screen.

2.9.4 crontab command

It is sometimes required to run some commands at a specific date and time.
The crontab command can be used for this purpose. See man crontab for
details. The cron daemon (see man cron) maintains a file which is managed
using the crontab command. This file contains the command to be run and
the time and date of its execution. Here is an example:

bash> crontab -l
0 0 * * 5 echo This is a cron | mail john

The second line is the contents of the crontab file.

2.9.5 csh command

This command is used to run the C shell or to execute a C shell script.
The syntax for this command is
csh [filename]


2.9.6 history command

This command is used to list the commands that you have typed so far.

2.9.7 date command

This command prints the system date and time. The date command has many
formatting arguments. See man date for details.

bash> date
Friday 25 Jan 2008

2.9.8 echo command

This command echoes back the string given to it.

bash> echo My name is achint
My name is achint

2.9.9 grep command

This command is used to search for a pattern in a file. We will see more
details on the grep command in subsequent chapters. Here is a simple example.

bash> grep goto file.c
/*You should not use goto in c programming */

2.9.10 unset command

The unset command removes a shell variable.

2.9.11 tar command

This command is used to create an archive of files or to extract files from an
existing archive. See man tar for details.

2.9.12 tee command

This command copies text from a pipe into a file, while also passing the
text on to its standard output. See man tee for details.

2.9.13 touch command

This command changes the date and time of a file without changing the file's
content. The touch command creates the file if it does not exist.
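
The echo, tee, grep and touch commands from this section can be tried together. The sketch below is an illustration only, assuming a POSIX shell with mktemp; the file names notes.txt and empty.txt are hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# tee writes its input to notes.txt and also passes it on to standard output.
shown=$(echo "My name is achint" | tee notes.txt)

# grep prints the lines of notes.txt that contain the pattern.
matched=$(grep achint notes.txt)

# touch creates an empty file when the name does not exist yet.
touch empty.txt
size=$(wc -c < empty.txt | tr -d ' ')

echo "tee showed:   $shown"
echo "grep matched: $matched"
echo "empty.txt has $size bytes"

cd / && rm -r "$workdir"
```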


Self-Check Questions

14. An ____________ is a short command or word that points at some path, or
absolute command name.
15. To change the date and time stamp on a file without reading the file, the
__________ command can be used.
16. To get the text from a pipe into a file, the ______ command can be used.



2.10 Summing Up

UNIX provides a rich set of commands for file management, printing, process
control, etc.


2.11 Answers to the self-check questions

1. telnet, rlogin
2. False
3. ftp
4. man
5. quota
6. du
7. who
8. False
9. lpq
10. lpr
11. nice
12. Ctrl-z
13. ps
14. alias
15. touch
16. tee


2.12 Terminal Questions

1. Define and explain the various command classes.
2. How is communication handled in UNIX? What is FTP?
3. Describe how file management is implemented in UNIX.
4. List the commands used in process control and describe their usage.
5. Explain the various print commands in UNIX.

COE Unit 1, Lesson 3


LESSON 3 UNIX FILE SYSTEMS

3. UNIX FILE SYSTEM
3.0 OBJECTIVES
3.1 INTRODUCTION
3.2 FILES
    3.2.1 Filenames
    3.2.2 Filename Extensions
3.3 DIRECTORIES
3.4 FILE TYPE
    3.4.1 Links
    3.4.2 Special Files
3.5 PATH TO A FILE
    3.5.1 The root directory
    3.5.2 Absolute Path
    3.5.3 Relative Path
3.6 MANIPULATING FILES
    3.6.1 Moving and Renaming Files and Directories
    3.6.2 Copying files and directories
    3.6.3 Removing Files and Directories
    3.6.4 Creating a directory
    3.6.5 Listing the files
3.7 FILE PERMISSIONS
    3.7.1 File Permissions
    3.7.2 Permissions for directories
    3.7.3 Changing the permissions on the file
3.8 CHANGING FILE OWNER AND GROUP
3.9 FILE SEARCH
3.10 VIEWING BEGINNING AND END OF A FILE
3.11 ANSWERS TO THE SELF CHECK QUESTIONS
3.12 TERMINAL QUESTIONS
3.13 SUGGESTED READING MATERIAL
3. UNIX File System




In the UNIX operating system the basic storage block is known as a file. This lesson
focuses on understanding the concepts of file manipulation and handling.



3.0 Objectives

After going through this lesson, you will be able to

Understand the basic concepts of files and directories
Understand the paths and pathnames in UNIX systems
Understand the UNIX file types
Understand the basic UNIX commands related to the file system
Understand the file manipulation and file security


3.1 Introduction

In a UNIX operating system the basic structure that stores data is known as a
file. You can store data of any format in a file. Multiple files can be put
together in a directory. Apart from containing files, a directory can contain
other directories as well. A directory that is inside another directory is called a
subdirectory.
A file is analogous to a notebook. A directory is analogous to a bag that
contains files.


3.2 Files

A file contains a sequence of bytes stored on a storage device, such as a
disk. On the disk the file is not necessarily stored in a single sector but can
be scattered across the disk. The OS keeps track of the information that belongs
to a specific sequence of data.

3.2.1 Filenames

Each file has a name. Any name can be given to a file. The name of a file can
be changed at any time. Unlike Windows, UNIX file names conventionally do not
contain spaces; spaces are permitted, but such names must be quoted or
escaped on the command line.

An important thing to remember here is that UNIX is case sensitive. This means
that A is different from a, so one should be careful about the case of each
letter in a file name. So, myfile.txt and myFile.txt are different files.
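
Case sensitivity can be checked directly. The sketch below assumes a POSIX shell and a case-sensitive file system, as is usual on UNIX systems; on a case-insensitive file system the two names would refer to the same file.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Two names that differ only in the case of one letter.
echo "lower" > "$workdir/myfile.txt"
echo "upper" > "$workdir/myFile.txt"

# On a case-sensitive file system these are two separate files.
count=$(ls "$workdir" | wc -l | tr -d ' ')
echo "Files created: $count"
cat "$workdir/myfile.txt" "$workdir/myFile.txt"

rm -r "$workdir"
```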

3.2.2 Filename Extensions

UNIX does not enforce any specific extensions on file names. This is unlike
Windows, where extensions are used to invoke applications directly.

In UNIX you can choose any extension for your files. Even multiple extensions
are permitted (e.g., data.tar.gz). Also, files need not always have extensions
(e.g., myFileOf24Dec2007).

Since extensions are not enforced, one can create files whose
extensions are misleading. For example, myProg.db may be a C program
while myData.cpp may contain simple text data. Obviously this is not
desirable, and one must be careful to use proper extensions.

Though UNIX itself does not enforce any extensions, there are many
important utilities/programs that expect a specific file extension. For example,
the C compiler expects files with .c or .h extensions.


3.3 Directories

Files are kept in directories. Directories group files into a logical
structure that depends entirely on the application and the user's
requirements. A directory can contain files and other subdirectories.
The figure below shows how the directory myData contains subdirectories
which in turn contain the files.

[Figure: directory tree. myData/ contains the subdirectories Investments/
and Official/, which hold the files RBI Bonds, ICICI, Reports, Sales plan
and customers.]

Each directory in UNIX contains two special subdirectories:
./   (The dot directory) indicates the current directory itself.
../  (The dot dot directory) indicates the parent directory of the current directory.

bash> pwd
Investments          Shows current directory as Investments/
bash> cd ..
bash> pwd
myData               Current directory after cd .. is myData/ (the parent)


3.4 File Type

Regardless of the data contained in a file, UNIX associates a file type with each
file. There are 4 file types: ordinary files, directories, links and special files.

An ordinary file is any file that you commonly use. These include text files,
executable programs, shell scripts, etc. We have already seen what
directories are. Let's now look at links and special files.

3.4.1 Links

A link is not a separate file but a second name for a file. Sometimes linking
files is a better option than copying, because once copied, the copies can be
changed independently. If you create a link, on the other hand, there is
actually only one copy of the file. A link is created using the ln command of
UNIX. There are two types of links, soft links and hard links. See man ln for
more details.
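
The difference between the two kinds of link can be seen in a small experiment. This is a sketch under the assumption of a POSIX shell and a file system that supports both hard and symbolic links; the file names are hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "first draft" > report.txt

# Hard link: a second name for the same file contents.
ln report.txt report_hard.txt
# Soft (symbolic) link: a small file that stores the path of the target.
ln -s report.txt report_soft.txt

# A change made through the original name is visible through both links,
# because there is only one copy of the data.
echo "second draft" >> report.txt
hard_lines=$(wc -l < report_hard.txt | tr -d ' ')
soft_lines=$(wc -l < report_soft.txt | tr -d ' ')
echo "Lines seen via hard link: $hard_lines, via soft link: $soft_lines"

# Removing the original name leaves the hard link usable, while the
# soft link now points at a name that no longer exists.
rm report.txt
if [ -e report_hard.txt ]; then hard_ok=yes; else hard_ok=no; fi
if [ -e report_soft.txt ]; then soft_ok=yes; else soft_ok=no; fi
echo "hard link usable: $hard_ok, soft link usable: $soft_ok"

cd / && rm -r "$workdir"
```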

3.4.2 Special Files

UNIX represents even devices with files. These files are special files. For
example, the audio output is typically the /dev/audio file. What can you do
with such a special file? Well, you can write into it or read from it, and
UNIX hides the details of how it actually works with the device. For
example, you can simply cat a music file to /dev/audio and it will be played!
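
Since /dev/audio is not present on every system, the sketch below uses two special files that are: /dev/null and /dev/zero. It assumes a POSIX shell with dd available and only illustrates that device files are read and written like ordinary files.

```shell
#!/bin/sh
# -c tests whether a path is a character special (device) file.
if [ -c /dev/null ]; then
    null_is_device=yes
else
    null_is_device=no
fi
echo "/dev/null is a device file: $null_is_device"

# Writing to /dev/null works like writing to any file; the data is discarded.
echo "discard me" > /dev/null

# Reading from /dev/zero yields as many zero bytes as are requested.
zero_bytes=$(dd if=/dev/zero bs=1 count=16 2>/dev/null | wc -c | tr -d ' ')
echo "Read $zero_bytes bytes from /dev/zero"
```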



Self-Check Questions

1. It is possible for a file in UNIX to have multiple filename extensions. (True/False)
2. A file in UNIX is required to have a filename extension, which signifies the
properties of that file. (True/False)
3. The filenames work and Work point to the same file in a UNIX file system.
(True/False)
4. Directories act as a categorization structure for the data in a UNIX file system.
(True/False)
5. A __________________ is a directory under the parent directory, which can be
used to categorize data further down the hierarchical file structure.
6. Which is not a UNIX file type?
a) Links
b) Symbolic Links
c) Program files
d) Directories
7. A ______________ (soft/hard) link is only a text file that points to some other file
somewhere in the file system and does not contain the data.




3.5 Path to a file

3.5.1 The root directory

The UNIX OS treats the directory / as the root directory. The root directory is the
ultimate parent of all other directories on a UNIX system.

3.5.2 Absolute Path

Every file on a system has a path that starts from the root.
For example,

bash> pwd
/dtu/IT_Courses/IT_101

The absolute path to the schedules file in this directory is
/dtu/IT_Courses/IT_101/schedules.txt.

The pwd command always lists the absolute path.

3.5.3 Relative Path

When in a directory, if you know the relative position of a file, you need not
access that file using its absolute path. You can simply use the relative path to
the desired file instead. For example,

bash> pwd
/dtu/IT_Courses/IT_999
bash> ls ../IT_102/schedules.txt

Here ../IT_102/schedules.txt is the relative path of schedules.txt with
respect to /dtu/IT_Courses/IT_999.


3.6 Manipulating Files

The file manipulation operations are file deletion, file renaming and moving
files from one location to another.

3.6.1 Moving and Renaming Files and Directories

The mv command of UNIX moves files and directories to specified locations.

bash> mv -i data data.old        Moves data to data.old
bash> mv -i data new             Moves data into the new/ directory
bash> mv -i oldDir newDir        Moves oldDir to newDir

3.6.2 Copying files and directories

The cp command of UNIX copies files and directories.

bash> cp old new
Copies file old to new. Overwrites new if it exists.

bash> cp -R /home/joe/bread /home/jam/food
Copies all files and subdirectories to the target directory.

3.6.3 Removing Files and Directories

Often you want to delete files or some directory (including its contents). For
example, you may be cleaning up your system. The rm command deletes files and
directories.

bash> rm file.txt my.txt
Removes the specified files.

bash> rm -f file.txt
The -f option indicates that rm will not give an error even if the file to be
deleted does not exist.

bash> rm -r directory1
The -r option indicates that all subdirectories are deleted as well.

Be careful with the rm command. A file or directory once deleted cannot be
undeleted in UNIX. There is no such thing as a trash can in UNIX. It is advisable
to use the -i option of the rm command all the time. See man rm for details.

If a directory is empty, it can be deleted using the rmdir command. See man
rmdir for details.

3.6.4 Creating a directory

The mkdir command creates a new directory.

bash> mkdir project                  Will create directory project/
bash> mkdir /home/anmol/data         An absolute path can be given to create a dir
bash> mkdir ../../myDir              A relative path can be given

3.6.5 Listing the files

The ls command of UNIX lists files and directories in the current directory. It
has a large number of options (see man ls).
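
The manipulation commands of this section can be exercised safely in a scratch directory. The following sketch assumes a POSIX shell with mktemp; the names project, backup and data are hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir project                      # create a directory
echo "plan" > project/data         # create a file inside it
cp -R project backup               # copy the directory and its contents
mv project/data project/data.old   # rename a file
after_mv=$(ls project)
echo "project/ now contains: $after_mv"

rm -r backup                       # delete a directory and its contents
rm project/data.old                # delete a single file
rmdir project                      # rmdir works because project/ is now empty
remaining=$(ls | wc -l | tr -d ' ')
echo "Entries left in the scratch directory: $remaining"

cd / && rm -r "$workdir"
```

Working inside a directory created by mktemp keeps a mistyped rm -r from touching real data.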












Self-Check Questions

8. The __________________ is the parent directory of all directories in the
UNIX file system.
9. The name of a file starting from the root directory is called the _____________
pathname of the file.
10. The relative pathname of a file is the name of the file with respect to the
current working directory. (True/False)
11. Pick the odd one out
The following operations can be performed on the file system
a) Building
b) Listing
c) Renaming filenames
d) Copying
12. On using the mv command from one file to an existing file it ___________
(appends/overwrites) the contents of the moved file onto the existing file.
13. To copy one directory to another it is mandatory to use the option _______ with
the command cp.
14. The command rmdir can be used to delete a complete hierarchical directory
structure. (True/False)



3.7 File Permissions

UNIX enforces permissions for files and directories. If you are the owner of a
file, you can set permissions that decide whether the file should be readable
by others or not, and so on. Let's see more details about file permissions.


3.7.1 File Permissions

The users of a UNIX file system fall into three classes:

The owner of the file
The group which the file belongs to
Other users

The long listing produced by ls -l shows the permissions, owner and group of
each file:

bash> ls -l
drwxr--r-- 1 achint editors  4096 drafts
-rw-r--r-- 1 achint editors 30405 edition-32
-r-xr-xr-x 1 achint editors  8460 final_draft

The first field shows the file type and the file permissions. achint is the
file owner and editors is the group. The size of final_draft is 8460 bytes.

Consider the following listing, in which final_draft has the permission
string -rwxr-xr--:

bash> ls -l
drwxr--r-- 1 achint editors  4096 drafts
-rw-r--r-- 1 achint editors 30405 edition-32
-rwxr-xr-- 1 achint editors  8460 final_draft

The string -rwxr-xr-- is read as follows:

First letter     - means ordinary file, d means directory, l means it is a link.
Next 3 letters   (rwx) The file is readable, writable and executable by the owner.
Next 3 letters   (r-x) Members of the group can read and execute but cannot
                 write into this file.
Last 3 letters   (r--) Others can only read this file.

3.7.2 Permissions for directories

For directories, read permission enables the user to list the contents of
the directory; write permission allows the user to create a file or a directory
inside that directory; and execute permission allows the user to change the
present working directory to that directory.

3.7.3 Changing the permissions on the file

The chmod command changes the permissions of a file or directory. See
man chmod for details. There are several ways to change the permissions of
a file. Here are a few examples:

bash> chmod ug+rw sample
bash> ls -ld sample
drw-rw---- 2 achint editor 96 Dec 8 12:53 sample

Permits user and group to read and write in sample.

bash> chmod a-rwx sample
bash> ls -ld sample
d--------- 2 achint editor 96 Dec 8 12:53 sample

Removes permissions for all.

There is another form in which the permissions can be set directly
by using an octal code. With three-digit octal notation, each numeral
represents a different component of the permission set: user class, group
class, and "others" class respectively.

For example, the number 764 in octal can be represented in binary as
111110100.
The first octal digit when converted to binary represents the permissions for
the owner (7 in octal is 111 in binary, which implies rwx for the owner).
The next octal digit when converted to binary represents the permissions for
the group (6 in octal is 110 in binary, which implies rw- for the group).
The last octal digit when converted to binary represents the permissions for
the others (4 in octal is 100 in binary, which implies r-- for others).
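
The octal form can be checked against the permission string that ls -l prints. The sketch below assumes a POSIX shell with mktemp; the file name sample is reused from the examples in this lesson, and only the first ten characters of the listing are kept because some systems append an extra marker after the permission string.

```shell
#!/bin/sh
workdir=$(mktemp -d)
touch "$workdir/sample"

# 7 = rwx for the owner, 6 = rw- for the group, 4 = r-- for others.
chmod 764 "$workdir/sample"
octal_mode=$(ls -l "$workdir/sample" | cut -c1-10)
echo "After chmod 764:  $octal_mode"

# The symbolic form changes selected bits only: remove write permission
# from the owner (u) and the group (g).
chmod ug-w "$workdir/sample"
symbolic_mode=$(ls -l "$workdir/sample" | cut -c1-10)
echo "After chmod ug-w: $symbolic_mode"

rm -rf "$workdir"
```

The octal form sets all nine permission bits at once, while the symbolic form adjusts only the bits that are named.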


3.8 Changing File Owner and Group

The chown command changes the owner of a file. See man chown for details.

The chgrp command changes the group of a file. See man chgrp for details.

3.9 File Search

The find command helps in locating files and directories. This is a powerful
command and has lots of options. See man find for details. Here is the syntax
of the find command.

find search_directory -name file_name [-print]

The find command searches through the contents of one or more directories
including all of their subdirectories.

bash> find / -name schedule -print
/dtu/IT_courses/IT_101/schedule
/dtu/IT_courses/IT_102/schedule

Finds all the files under / named schedule.

Another example, in which only directories with the given name are searched
for:

bash> find . -type d -name abc -print

Finds the directory abc (and not a file named abc) under the present directory.


3.10 Viewing Beginning and End of a file

UNIX provides commands with which it is possible to display the contents of
the start or end of a file. These are the head and tail commands.

head   start of the file
tail   end of the file
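
The find, head and tail commands covered in this lesson can be tried on a small scratch tree. This is only a sketch, assuming a POSIX shell with mktemp; the directory and file names mirror the examples in this lesson but are otherwise hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir -p IT_101 IT_102 abc
touch IT_101/schedule IT_102/schedule

# Search the current directory and all subdirectories for files named schedule.
found=$(find . -name schedule -print | sort)
echo "$found"

# Restrict the search to directories (-type d) named abc.
found_dir=$(find . -type d -name abc -print)
echo "$found_dir"

# Write the numbers 1 to 20, one per line, then show both ends of the file.
i=1
while [ "$i" -le 20 ]; do echo "$i"; i=$((i + 1)); done > numbers.txt
first_three=$(head -n 3 numbers.txt | tr '\n' ' ')
last_three=$(tail -n 3 numbers.txt | tr '\n' ' ')
echo "First lines: $first_three"
echo "Last lines:  $last_three"

cd / && rm -r "$workdir"
```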

Example usage

bash> head -n 10 file        Shows the first 10 lines of file


Self-Check Questions

15. Pick the odd one out
The users in a UNIX file system can be categorized as:
a) Owners
b) Group
c) Friends
d) Other users
16. To change the file permissions from one set to another, the command
___________ can be used.
17. The __________________ commands are used to change the owner and the group of
the file.
18. The _______ command lets you search for files and directories.
19. The _______ command will be useful to show the last few lines of a file.



3.11 Answers to the self check questions

1. True
2. False
3. False
4. True
5. Subdirectory
6. Program files
7. Soft link
8. Root
9. Absolute path
10. True
11. Building
12. overwrites
13. -R
14. False
15. Friends
16. chmod
17. chown, chgrp
18. find
19. tail


3.12 Terminal questions

1. Write a detailed note about the hierarchical file structure.
2. Explain briefly the manipulation operations possible on the file structure.
3. Write a brief note on the permissions on the files and directories in UNIX.
Also, explain how we can change permissions of the files in UNIX using the
chmod command. Use some relevant examples to explain the concepts.
4. Explain the UNIX system file types, and explain the salient features of each
file type.


3.13 Suggested Reading Material

1. The UNIX Programming Environment, by Kernighan and Pike.
2. The Design of the UNIX Operating System, by Maurice J. Bach.


COE Unit 1, Lesson 4


LESSON 4 THE VI TEXT EDITOR

4. THE VI TEXT EDITOR
4.0 OBJECTIVES
4.1 INTRODUCTION
4.2 FILES CONTAIN STREAM OF CHARACTERS
4.3 HOW VI HANDLES THE FILES
4.4 INVOKING VI
4.5 MODES OF VI
    4.5.1 Command mode
    4.5.2 Edit mode
    4.5.3 Switching between command mode and edit mode
4.6 POSITIONING TEXT ON THE SCREEN
    4.6.1 Scrolling and moving the Screen
    4.6.2 The GOTO Command
    4.6.3 Searching
4.7 POSITIONING THE CURSOR: H, L, J, K COMMANDS
4.8 EDITING USING SCOPES
    4.8.1 Delete Text (d, D)
    4.8.2 Change Text (c, C)
    4.8.3 Replace Command (r, R)
    4.8.4 Erase Command (x, X)
    4.8.5 Undo Command (u, U)
4.9 TEXT INSERTION
    4.9.1 Append Command (a, A)
    4.9.2 Insert Command (i, I)
    4.9.3 Open Command (o, O)
    4.9.4 Read Command (:r)
4.10 GLOBAL SEARCH AND REPLACE FOR TEXT
4.11 REARRANGING AND DUPLICATING TEXT
    4.11.1 Copying Text and Moving the Copy
    4.11.2 Deleting Text and Moving It
4.12 NAMED BUFFERS
    4.12.1 Using the named buffers
4.13 MISCELLANEOUS INFORMATION
    4.13.1 Creating Line Numbers
    4.13.2 Lines and Sentences in VI
    4.13.3 Joining Lines
    4.13.4 Repeating a Command
    4.13.5 Editing Multiple Files Using vi
    4.13.6 Mark Command
4.14 SAVING OR STORING A FILE
    4.14.1 Writing to the file
    4.14.2 Exiting the vi editor
4.15 SUMMING UP
4.16 ANSWERS TO THE SELF-CHECK QUESTIONS
4.17 TERMINAL QUESTIONS


4. The VI Text Editor




When you write programs or scripts, modify data, write mail, etc., you will need
to use a text editor. This lesson focuses on the vi text editor, one of the most
commonly used text editors on UNIX systems.



4.0 Objectives

After going through this lesson, you will be able to:

Understand how to open and edit files using vi
Understand various text insertion and deletion methods in vi
Understand the basic structure of the vi text editor
Understand the commands to edit text using vi and scopes
Understand miscellaneous other features of vi


4.1 Introduction

vi is a visual, non-graphical and interactive text editor which allows a user to
create, modify, and store files on the computer.

Note that in this chapter, the cursor is shown by underscoring a character.
For example, the cursor is at the letter n in the following line:
This is a li_n_e.

"There's an editor out there that programmers have been using to edit their
programs for the last 24 years. It's called vi (say vee-eye) and it is quite
powerful."

Source: http://www.websiterepairguy.com/articles/vi/12_learn_vi.html


4.2 Files contain stream of characters

When you type characters, numbers, etc., each key is stored as an ASCII
character. For example, a gets recorded as ASCII 97. When you write lines
like these:
This is line 1
This is line 2

they are stored as a stream of characters like This is line 1\nThis is
line 2. Here \n is a special character which signifies a new line.
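
The stream can be inspected from the shell. The following is a minimal sketch, assuming the standard printf, od, wc and tr utilities are available; the sample text is the two lines above, each ended by a newline.

```shell
# Dump the two lines character by character; each newline is shown as \n
printf 'This is line 1\nThis is line 2\n' | od -c

# Count the characters in the stream
# (14 visible characters plus 1 newline for each of the two lines)
byte_count=$(printf 'This is line 1\nThis is line 2\n' | wc -c | tr -d ' ')
echo "characters in stream: $byte_count"

# Print the decimal ASCII code of the letter a
ascii_a=$(printf 'a' | od -An -tu1 | tr -d ' \n')
echo "ASCII code of a: $ascii_a"
```

The od -c listing makes the otherwise invisible newline characters visible, which is the same stream that vi reads into its buffer.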





4.3 How Vi Handles The Files

When you open a file in vi, the file contents are read into a buffer. All text
editing is done in memory, on the buffer. The file on the disk is not
updated unless vi is explicitly asked to save the changes. This lets you
change the contents of the buffer until you are satisfied, without changing
the file on the disk.
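
The copy-then-save idea can be imitated with ordinary shell commands. This is only an analogy under stated assumptions: a scratch directory from mktemp, and a file named buffer (a hypothetical name) standing in for vi's in-memory buffer. It is not how vi itself is implemented.

```shell
workdir=$(mktemp -d)
printf 'original text\n' > "$workdir/demo.txt"

# "Open": the file contents are copied into a separate working area
cp "$workdir/demo.txt" "$workdir/buffer"

# "Edit": only the working copy changes
printf 'a new line\n' >> "$workdir/buffer"
disk_lines_before=$(wc -l < "$workdir/demo.txt" | tr -d ' ')
echo "lines on disk before saving: $disk_lines_before"

# "Save": the working copy is written back over the file on disk
cp "$workdir/buffer" "$workdir/demo.txt"
disk_lines_after=$(wc -l < "$workdir/demo.txt" | tr -d ' ')
echo "lines on disk after saving: $disk_lines_after"

rm -rf "$workdir"
```

Until the second cp runs, the file on disk still holds only the original line, just as a file edited in vi stays unchanged until it is saved.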

4.4 Invoking vi

The vi editor can be invoked using the following command:

    $ vi demo.txt

The figure below shows how the file looks when opened in vi.

    _                       <-- The cursor
    ~                       <-- A tilde (~) in vi represents an empty line.
    ~
    ~
    ~
    .
    .
    demo.txt [new file]     <-- File information


4.5 Modes of vi

vi has two modes in which you will work.

4.5.1 Command mode

The command mode is the default mode. All vi commands work only in the
command mode. In the command mode you cannot write text. You can only
move around in the text, delete text, modify existing text, search for text, etc.

4.5.2 Edit mode

In edit mode you can add new text in vi. In edit mode you cannot use any
commands to search or navigate in the text.

4.5.3 Switching between command mode and edit mode

When in command mode, a few commands take you to edit mode. For
example, in the command mode, if you press i, you will get to the edit mode
and can add text.

When in the edit mode, you can stop editing and go back to the command
mode by pressing the <Esc> key.

The example below shows the switch between the two modes.

Step  Line on screen    State
1     This is a line    You are in command mode and the cursor is at a. Press i.
2     This is a line    The cursor is at the same position but edit mode has
                        started. Now press d.
3     This is da line   The cursor is at the letter a and the letter d is added.
                        Press <Esc>.
4     This is da line   Now you are in command mode.


4.6 Positioning Text on the Screen

vi provides several ways to reach the text you want to edit in a file.

4.6.1 Scrolling and moving the Screen

By scrolling the screen we can reach the desired text. The table below
explains how one can scroll the screen.

Command   Resulting Action
Ctrl+u    Moves the window upwards by half a screen
Ctrl+d    Moves the window downwards by half a screen
Ctrl+b    Moves the window upwards by one complete screen
Ctrl+f    Moves the window downwards by one complete screen
H         Takes the cursor to the top of the screen
L         Takes the cursor to the bottom of the screen
M         Takes the cursor to the middle of the screen

All these commands work only in the command mode.

4.6.2 The GOTO Command

Sometimes you already know the line number you want to reach. You
can use the GOTO command in such cases. The table below explains the command
and the resulting action.

Command            Resulting Action
G                  Moves the cursor to the last line
<N>G (e.g. 33G)    Moves the cursor to the Nth line
:<N> (e.g. :65)    Moves the cursor to the Nth line

4.6.3 Searching

It is also possible to search for a pattern; the screen is then moved
to the occurrences of the desired pattern.

Here are the commands that work for search in vi.

Command     Resulting Action
/pattern    Searches for the pattern forward from the current cursor position
?pattern    Searches for the pattern backward from the current cursor position
:set ic     Makes subsequent searches case insensitive (ic stands for
            ignore case)
:set noic   Makes subsequent searches case sensitive

Once you start a search you can repeat it in a simple way. On keying in n,
vi goes to the next instance of the pattern in the file, and with N it
searches in the opposite direction.
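
A search run outside vi can also supply the line number to jump to with <N>G or :<N>. The following is a rough sketch, assuming the standard grep, head and cut utilities and a throwaway file sample.txt (a hypothetical name); grep's -i option plays the role that :set ic plays inside vi.

```shell
workdir=$(mktemp -d)
printf 'alpha\nBeta\ngamma\nbeta again\n' > "$workdir/sample.txt"

# Case-sensitive search (comparable to /beta with :set noic);
# -n prefixes every match with its line number
grep -n 'beta' "$workdir/sample.txt"

# Case-insensitive search (comparable to /beta with :set ic)
insensitive_matches=$(grep -ic 'beta' "$workdir/sample.txt")
echo "case-insensitive matches: $insensitive_matches"

# Keep the line number of the first case-sensitive match
first_line=$(grep -n 'beta' "$workdir/sample.txt" | head -n 1 | cut -d: -f1)
echo "in vi, ${first_line}G or :${first_line} moves the cursor to that line"

rm -rf "$workdir"
```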


4.7 Positioning the Cursor : h, l, j, k commands

This section explains finer control of the cursor.

You can move the cursor by use of the "arrow" keys. You can also use the
"direction" keys "h" (move left by one character), "j" (move down to the next
line), "k" (move up to the previous line), and "l" (move right by one character).

The "RETURN" key is similar to the "j" key in that it moves the cursor down
one line. However, the "RETURN" key always positions the cursor at the
beginning of the next line, whereas the "j" key moves the cursor straight down
from its present position, which may be the middle of a line. Moving several
lines may be accomplished by repeatedly pressing the "RETURN", direction
or arrow key, such as "k" "k" "k" to move upward 3 lines. You can also
precede any of these keys with a number and achieve the same result, e.g. "3k".



Self-Check Questions

1. If the cursor is resting on the 34th line of a file and it is to be placed on
the 74th line, then the command to be issued is _____________G.
2. On searching with ? and /, the search will respectively be done
______________ and ____________________. (backwards/forward)
3. To get the file statistics using the vi editor, the command to be issued is
___________.
4. On keying in N while searching for a pattern using ?, the cursor will reach
the next instance of the pattern ________________. (backward/forward)
5. To move to the 25th word in the line while the cursor is on the 18th word, the
command that can be issued is ___________.
6. To move to the beginning of the line on which the cursor is residing in a text
file, the command that can be issued is __________.
7. The vi editor creates a temporary buffer area while editing a file, which is
stored on the disk and is used later on for reference purposes by the editor.
(True/False)




4.8 Editing using scopes

vi commands have scope built into them. For example, when you say dd,
the first d indicates the delete operation and the second d tells it to apply
the command to a line. Similarly, yy yanks a line. Commands like d
and y can also be given an explicit scope, and vi commands also have
upper-case versions.

Scope   Text Unit Encompassed
0       Beginning of line
$       End of line
W, w    Word right
B, b    Word left
E, e    End of word right


With the scopes we can use the operators to get more powerful outcomes.
We can also edit very locally using a combination of operators and
scopes. In this section we will discuss this combination.

4.8.1 Delete Text (d, D)

The delete command is used in command mode to remove portions of text
from the file being edited. The scope must be specified after the delete
operator. Some of the most common scopes used with the delete operator
are shown in the next table.

Delete operator
and scope         Resulting Action
dw                Delete word forward
d(                Delete complete sentence backward
d)                Delete complete sentence forward
dG                Delete from current line to end of file
dL                Delete from current line to end of screen
d/^xyz            Delete from current line to first occurrence of pattern
dtx               Delete from current place to first occurrence of x
D                 Delete from the cursor to the end of the line (same as d$)

NOTE: The same scopes can be used with all the scoped text editing
commands, so we will not repeat them for the other commands; only
different scopes or operators, if any, will be discussed.

NOTE: It is important to remember that the current cursor position serves as
the starting point for the scope. This means that a scoped deletion
starts from the current position. For example, typing "2dd" will delete
two consecutive lines beginning with the current line.

4.8.2 Change Text (c, C)

You can use the change command to change the text in a line. Scopes are
applied in the same manner as they are used with the delete command.

On issuing the change command, vi enters the edit mode; after the text is
inserted, pressing the <ESC> key returns it to the command mode. The
example shows how the change command can be used.

Step  Line on screen              Action / Result
1     This is the line to watch   The cursor is positioned at the t of "the".
                                  Issue the command 2cw (change two words)
                                  and key in new line.
2     This is new line to watch   Text inserted in place of two words.


4.8.3 Replace Command (r, R)

The replace command is used to replace portions of text on the screen. The
table shows the two variants of the replace command and their usage for
replacing text.

Replace
command   Text replacing action
r         Replaces a single character at a time
R         Replaces as many characters as there are keystrokes, until the
          user presses <ESC>

The example below shows both variants.

Step  Line on screen                       Action / Result
1     This is the line to watch out for.   The cursor is positioned at l.
                                           Issue the r command and type m.
2     This is the mine to watch out for.   l is replaced by m. Issue the R
                                           command, key in kite and press <ESC>.
3     This is the kite to watch out for.   The complete word is replaced.


4.8.4 Erase Command (x, X)

The erase command removes a character.

Erase
command   Erase action
x         Erases the character on which the cursor is placed
X         Erases the character to the left of the cursor

4.8.5 Undo Command (u, U)

The undo command reverses the effect of the editing operations done on a file.

u reverses the effect of the last editing command, whereas U reverses all the
changes made to the current line since the cursor was moved onto it.


4.9 Text Insertion

The vi editor provides several ways to insert text in the file. We will
discuss each of these methods in some detail, but it is advisable for a
newcomer to take up one approach and use that to insert text.

4.9.1 Append Command (a, A)

It is used to add to the existing text. It has two forms, a and A. These two
forms are explained in the figure below.

Step  Line on screen                 Action / Result
1     The student laughed.           With the cursor on the last t of student,
                                     issue the a command, type s and press <ESC>.
2     The students laughed.          Text appended after the cursor.
3     The students laughed. Aloud.   With the A command, text is appended at
                                     the end of the line.

4.9.2 Insert Command (i, I)

This command is used to insert text into a text file. It has two
forms, i and I. The figure below explains how to use this command.

Starting line: The student laughed.

Command                           Resulting line               Effect
i, typing new and <ESC>           The new student laughed.     Text inserted before
                                                               the cursor
I, typing Again and <ESC>         Again The student laughed.   Text inserted at the
                                                               beginning of the line


4.9.3 Open Command (o, O)

The open command opens a new line to add text. It has two forms, o and O;
the figure below explains their usage.

Starting line:
    The student laughed.

On issuing the O command and typing A new line is added and <ESC>, text is
inserted above the current line:
    A new line is added
    The student laughed.

On issuing the o command (with the cursor on the line The student laughed.)
and typing Another line and <ESC>, text is inserted below the current line:
    A new line is added
    The student laughed.
    Another line


4.9.4 Read Command (:r)

The read command allows the user to copy the contents of another file into the
current file. While in command mode, with the cursor on the line above where
you want the other file read in, type:

    :r <File>

This reads the specified file into the current file at the cursor location.


4.10 Global Search and Replace for text

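Different commands can be used for searching and replacing with different
purposes. For example, :1,$s/oldText/newText/g replaces all the instances of
oldText with newText in the file, :1,15s/oldText/newText/g does so only from
line 1 to line 15, and :g/oldText/s//newText/gc asks before replacing the text
each time.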
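
The s/old/new/g form of vi's substitute command is shared with the sed stream editor, so the effect of a whole-file replacement and of a line-range replacement can be tried from the shell. The following is a minimal sketch, assuming the standard sed and grep utilities and a throwaway 20-line file sample.txt (a hypothetical name); sed prints the result instead of changing the file.

```shell
workdir=$(mktemp -d)

# Build a 20-line sample file; every line contains oldText twice
i=1
while [ "$i" -le 20 ]; do
    echo "oldText and oldText"
    i=$((i + 1))
done > "$workdir/sample.txt"

# Whole file, comparable to :1,$s/oldText/newText/g in vi
all_left=$(sed 's/oldText/newText/g' "$workdir/sample.txt" | grep -c 'oldText' || true)
echo "lines still containing oldText after a whole-file replace: $all_left"

# Lines 1 to 15 only, comparable to :1,15s/oldText/newText/g in vi
range_left=$(sed '1,15s/oldText/newText/g' "$workdir/sample.txt" | grep -c 'oldText' || true)
echo "lines still containing oldText after replacing lines 1-15: $range_left"

rm -rf "$workdir"
```

The trailing g matters in both tools: without it only the first oldText on each line would be replaced.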





Self-Check Questions

8. To delete the word on which the cursor is placed, the D command can be issued.
(True/False)
9. The change operator invokes the text insertion mode. (True/False)
10. The operator _______________ changes the text, yet does so in command
mode and not in text insertion mode.
11. The command ______________ replaces the characters on screen one at a time
as the user keys in the new characters.
12. To erase the character on which the cursor is placed, the __________ command
is to be issued, whereas to delete the character to the left of the one on which
the cursor is placed, the _________ command needs to be issued.
13. To replace the name shahs with mazes in a text file, the command to be issued
is ___________.




4.11 Rearranging and Duplicating Text

You can yank text to copy it to another place in the text file.

4.11.1 Copying Text and Moving the Copy

Step 1: Copying Text with the Yank Command (y, Y)

The yank command y can be used with scopes, and the same scopes can
be used as we have seen with the delete command. Yanking places the yanked
content into an unnamed buffer. Some of the examples of yanking are:


Starting line (the cursor is at the character l of line):
    This is the line to be yanked.
On issuing the command 3yw, which means yank 3 words, vi yanks 3 words
starting from the current cursor position.

Starting lines (the cursor is on the first line):
    This is the line to be yanked
    This is another line to yank
    This is yet another line that can be yanked
Issuing the command 3yy will yank 3 lines starting from the current line.

Search and replace commands (section 4.10):

Command                      Resulting Action
:1,$s/oldText/newText/g      Replaces all the instances of oldText with
                             newText in the file
:1,15s/oldText/newText/g     Replaces oldText with newText from line
                             number 1 to 15
:g/oldText/s//newText/gc     Asks before replacing the text each time

Step 2: Put Command (p, P)

The put command is used to place the contents of the unnamed buffer back
into the file being edited. Returning whole lines into the text is handled
differently than word and sentence fragments.

The lower-case "p" places the line or lines below the current line and the
upper-case "P" places them above the current line.

A handy feature of yank & put is the ability to insert a copy repeatedly within
the same file. The format for this action is yank, relocate cursor, put, relocate
cursor, put, etc. until all needed copies have been placed.

4.11.2 Deleting Text and Moving It

When you delete text, it gets yanked as well, and thus it can be put in
another place in the text.

Before (the third line will be deleted using the dd command):
    This is the file.
    It contains text.
    This line will be deleted.
    Below this it will be later on pasted.
    This will be the end of file.

After dd, the cursor is placed on the line Below this it will be later on
pasted. On using the p command, the deleted line is placed below the present
cursor position:
    This is the file.
    It contains text.
    Below this it will be later on pasted.
    This line will be deleted.
    This will be the end of file.


4.12 Named Buffers

Named buffers offer another way to copy (yank) or remove (delete) text.

The unnamed buffer only saves the last deleted or yanked text. vi provides 26
named buffers (a-z) for your use. Named buffers allow users to
yank multiple pieces of text and put them at different places.

These named buffers remain only for the life of the current editing session.
Once you quit vi, these buffers are no longer available.

Here are a few examples of how named buffers are used.

Typing "g7yy in command mode implies the following:
The quote (") calls for a named buffer
g selects the buffer named g
7yy yanks 7 lines into the named buffer g.

Now, if you type "gp, it implies the following:
"g calls for the named buffer g
"gp pastes the contents of the named buffer g.

You can append more information to a named buffer. When you use the
capital letter to yank into a named buffer, the yanked contents are appended
to the named buffer. For example, "g7yy yanks 7 lines into buffer g; now
"G3yy would yank 3 lines and append them after the already yanked 7 lines in
the buffer g.

These named buffers are not write-protected. If a named buffer contains
information and it is called a second time with its lower-case name, the
original material is overwritten.

4.12.1 Using the named buffers

Once you yank contents into a named buffer g, you can paste them anywhere in
the file:

"gp   puts the contents below the current line
"gP   puts the contents above the current line

It is important to note that the vi editor will not tell you which buffers are
currently defined, nor which buffer contains what; you must
remember the names of the buffers and their contents.



Self-Check Questions

14. To copy 10 lines of text into the unnamed buffer, the 10_____ command can be
used. (Y/y)
15. The text saved in the unnamed buffer by yanking or deleting can be
placed back into the text below the current line where the cursor is placed by
using the _________ command.
16. To append 5 more lines to the named buffer a, the command to be issued
is __________.
17. If a named buffer is called upon again and new information is written into it,
then the new information is appended to the buffer. (True/False)
18. It is possible to get the buffer name on the basis of the content stored in the
buffer. (True/False)






4.13 Miscellaneous Information

In this section we will discuss some miscellaneous information which
can be used to be more productive in editing files.

4.13.1 Creating Line Numbers

In the vi editor the line numbers are not shown by default, but vi can
display them. The following command lists every line of the file with its
line number:
:%nu

Sometimes it is desired that the line numbers are seen only for the current
session. To have line numbers displayed for the current session, type:
:set number

Immediately you will see the line numbers appear on your screen. They are
only displayed, not written into the file, and they will remain until you
exit the editor or type:
:set nonu
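
Because :set number only affects the display, the file on disk stays unnumbered. Where a numbered copy of a file is actually wanted, a filter outside vi can produce one. The following is a minimal sketch, assuming the standard nl utility and a throwaway file demo.txt (a hypothetical name).

```shell
workdir=$(mktemp -d)
printf 'first\nsecond\nthird\n' > "$workdir/demo.txt"

# nl -ba numbers every line (blank ones included) on standard output
numbered=$(nl -ba "$workdir/demo.txt")
printf '%s\n' "$numbered"

# The file itself is left without numbers
original_first=$(head -n 1 "$workdir/demo.txt")
echo "first line still stored in the file: $original_first"

# Keep the last numbered line and the line count for inspection
last_numbered=$(printf '%s\n' "$numbered" | tail -n 1)
numbered_count=$(printf '%s\n' "$numbered" | wc -l | tr -d ' ')

rm -rf "$workdir"
```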

The "control s" command stops screen movement.
The "control q" command releases a frozen screen.
The "control l" command refreshes the vi screen without modifying the file.

The .exrc file
There are many setup (set) options that can be set or changed for vi. It is
advisable to put these commands into the ~/.exrc file so that vi loads these
settings automatically every time it starts. In this file a comment is a line
that begins with a double quote (").

For example:

bash> cat ~/.exrc
" Show line numbers
set nu
" Do not wrap around the end of the file while searching
set nows
bash>

The following command will show you the available setup options.
:set all

4.13.2 Lines and Sentences in VI

To be successful in your editing, it is necessary to understand what the editor
considers a line and a sentence. Just for clarity, a line and a sentence are
different items to the editor. To the editor, a line begins on the left of the
screen and terminates at a carriage return. The carriage return is the invisible
character placed in your file every time you press the "RETURN" key. A
sentence to the editor is a string of characters of unspecified length (a few
characters to many lines) terminating with one of the punctuation marks ., ?, !
followed by either a carriage return or two blank spaces.

4.13.3 Joining Lines

As you are editing files, you will find it is desirable to combine or join lines.
This is easily done using the "J" (join) command. An illustration of joining lines
is given below. The cursor is located on the top line when the "J" command is
issued. vi will move the lower line and butt it to the end of the upper line. The
editor takes care of necessary spacing for you.



4.13.4 Repeating a Command

To make life a bit easier, vi allows text alteration commands to be repeated by
using the . (repeat) command. A handy way to illustrate the repeat
command is with the cw command, replacing a single word with two new
words throughout a paragraph.

In this example, the first occurrence of PU is located with the search
command /PU. Then, with the cursor on the P of PU, the cw command is
issued, followed by Purdue University and <ESC>. The n key is pressed
to find the next occurrence of PU. The cursor relocates to the P of the next
PU and all that is required to change it to Purdue University is to type .
(a period).



4.13.5 Editing Multiple Files Using vi

The vi editor provides a feature which allows a user to edit multiple files by
use of the ":e" (edit) command. This ability to access multiple files without
leaving the editor permits a user to see information in another file without
exiting the editor. Additionally, because files are opened within the same
editor invocation they can share the same named buffers, thereby making the
transfer of text possible between the files. When vi is invoked, a work area
called a buffer is created for editing purposes. It is into this work space that a
copy of a specified disk file is placed. The editor permits only one file copy in
this buffer space at a time. Thus, after making changes to a file (delete, add, or
change), you must inform the editor what you wish done with the current buffer
contents before you will be permitted to bring another file into this space. You
do this by use of ":w" (write the current buffer contents to the opened file),
":e! newfile" (toss the current buffer contents, with no update to the opened
file, and place a copy of the newly called file in the buffer), or ":quit!" (exit
the editor and toss the buffer contents).

When you have two files open, vi permits toggling between files by use of
":e #". This works because whenever vi sees the character "#" used in a
command where a filename is expected, it substitutes the "#" with the name of
the previous file.

For example, if you had been in fruits and then opened vegetables, the command
":e #" would return you to where you were in the fruits file. Repeat ":e #" and
you would be back in vegetables.

4.13.6 Mark Command

The mark command sets up a mark in vi, and while editing you can go back to
the places where you had placed these marks. vi provides 26 marks, which
are named a to z.

You can put a mark g at a position using a command like the following:
mg
Note that the marks are not visible at all in vi. You have to remember the
marks that you have put. To go back to the line marked g, use the
following command (an apostrophe followed by the mark name):
'g


4.14 Saving or Storing a File

As mentioned earlier, the vi text editor creates a temporary working area (the
buffer) which holds either a copy of an existing file on the disk or a new file.
This area is at the disposal of the user until he saves the file. On saving the
file, the changes in the buffer are written to the file stored on the disk. The
file on the disk, on the other hand, is removed only with the remove command
of UNIX.

The changes made in the buffer are not saved until you give the command
to do so, so it is advisable to keep saving the work periodically. We will
discuss how to save our work periodically. Below is a schematic showing how
the work is saved on the disk.



4.14.1 Writing to the file

It is useful and safe to save the work periodically when typing text. The :w
command writes the buffer to the file on the disk, thus saving the changes.
This works in the command mode.

    :w <File>

Saves the changes to <File>.


4.14.2 Exiting the vi editor

To exit the vi editor you can use the quit command :q. This command in
conjunction with the write command gives :wq (write and quit). To discard the
changes made you can use :q!.


Self-Check Questions

19. The text insertion command takes vi from command mode to text
insertion mode. (True/False)
20. If some text is to be added to the current text, such that the new text is
added at the end of the line on which the cursor is positioned, then text
insertion is invoked with the command ____________.
21. If in some application the same piece of text from one text file is
to be inserted in another text file, the user can use the command
_______________.
22. When using the text insertion command read (:r), the ESC key can be used to
switch back to the command mode from text insertion mode. (True/False)
23. By issuing the write command once in the complete session we ensure that
all the text inserted in the session, including the text inserted after the write
command is issued, is saved. (True/False)
24. If we need to store the editing work done in the editor, the command
___________ needs to be issued.
25. If one finds out that he does not need the text he has inserted into the editor
window in the present session, then he is required to issue the ____________
command.
26. In some application it is required to create a file new from a file old with some
new text, and the file old needs to be kept unchanged. The vi command that
should be issued for writing the new changes is __________________ and the
one for exiting the vi session is ____________.
27. The vi editor can operate in two modes. The mode which lets the user change
the text in the file is the _____________________ mode.




4.15 Summing Up

In this chapter we have looked at the vi text editor in detail. We
discussed a lot of techniques and viewed examples that can help you in
editing text files very efficiently. With these techniques at hand you will be
able to learn other advanced techniques when you work in actual
environments and situations.


4.16 Answers to self-check questions

1. 74G
2. backwards and forward
3. Ctrl+g
4. forward
5. 7w
6. 0 (zero)
7. False
8. False
9. True
10. r
11. R
12. x, X
13. :g/shahs/s//mazes/g
14. 10yy (10Y also works)
15. p
16. "A5yy
17. False
18. False
19. True
20. A
21. :r
22. True
23. False
24. :w
25. :q!
26. :w <new>, :q!
27. Edit

4.17 Terminal Questions

1. Explain the processes that are used for changing text using the vi text
editor.
2. Explain the processes that can be used to delete text using the vi text
editor.
3. Write a note about the named buffers and also explain some usage with
practical examples.
4. Write briefly about the rearranging and duplicating of text in the vi text
editor.
5. Explain how the vi editor functions.
6. What are the different modes for operating the vi editor? Explain in brief.
7. Explain the append, insert and quit modes of operation of the vi editor.




UNIT 2: SHELL SCRIPTING

1. INTRODUCTION TO SHELL
2. SHELL SCRIPTING AND DEBUGGING
3. CONDITIONAL STATEMENTS
4. REPETITIVE TASKS
5. REGULAR EXPRESSIONS

COE Unit 2, Lesson 1



LESSON 1 INTRODUCTION TO SHELL

1. INTRODUCTION TO SHELL
1.1 INTRODUCTION
1.2 THE SHELL: COMMAND PROCESSOR
1.3 BASH: BOURNE AGAIN SHELL
1.3.1 Advantages of BASH
1.4 REDIRECTION
1.4.1 Standard Output
1.4.2 Standard Input
1.4.3 Standard Error
1.4.4 Combining Streams
1.5 VARIABLES
1.5.1 Setting strings with the variable names having $
1.5.2 Types of variables
1.5.3 Exporting variables
1.5.4 Using Shell Variables
1.6 COMMAND SUBSTITUTION
1.7 PATTERN MATCHING: THE WILD CARDS
1.7.1 The * & ?
1.8 THE CHARACTER CLASS
1.9 MATCHING A DOT (.)
1.10 SUMMING UP
1.11 ANSWERS TO THE SELF-CHECK QUESTIONS
1.12 TERMINAL QUESTIONS



1. Introduction to Shell





The starting point for the unit on Shell Scripting is to first know about the shell.
Bash is also introduced in this chapter. In the subsequent lessons further details
pertaining to advanced concepts are discussed at length.



1.0 Objectives

After going through this lesson, you will be able to:

Know about different types of shell
See how the shell executes commands
Understand and use redirection, variables, pattern matching, etc.


1.1 Introduction

The shell in UNIX is the program that acts as an interface between the user
and the UNIX system. It reads what the user types, interprets it and tells the
kernel what the user wants; it then collects the results of the command
execution from the kernel and presents them to the user in a form the user
understands. Much of what we can do with a UNIX system is possible thanks
to this program, which lets a short command line express a substantial task.
The shell is also known as a command processor: it processes the
instructions you issue to the machine.


1.2 The Shell: Command Processor

On logging in to the UNIX system you encounter a prompt ($ or % or a custom
prompt). Although it seems that nothing is happening, a program is running
and waiting for your instructions: this is the shell. The shell starts when a
user logs in and keeps running until the user logs out.

When you issue a command, the shell is the first agency to receive it. It
accepts and interprets user requests; these are generally the UNIX
commands we key in. The shell examines and rebuilds the command line and
then leaves the execution work to the kernel. The kernel handles the
hardware on behalf of these commands and all processes in the system.
Users can thus afford to remain ignorant of the happenings behind the scenes.
This is one of the beauties of UNIX design and philosophy.

The shell is generally sleeping. It wakes up when input is keyed in at the
prompt. This input is the input to the program that represents the shell. Below
is the list of activities that the shell typically performs.

It issues the prompt ($ or otherwise) and sleeps till you enter a command.

After a command has been entered, the shell scans the command line for
some special characters (metacharacters, which we will look at later) that
have a special meaning for it. Because it permits abbreviated command lines
(like the use of * to indicate all files, as in rm *), the shell has to make sure the
abbreviations are expanded before the command can act upon them.

It then creates a simplified command line and passes it on to the kernel for
execution.

The shell can't do any work while the command is being executed, and has to
wait for its completion.

After the job is complete, the prompt reappears and the shell returns to its
sleeping role to start the next cycle. You are now free to enter some other
command.

Note: The commands at the lower levels do not know or understand the
metacharacters, so the shell has to resolve them to their plain
representations before the command line is passed to the kernel.
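
The expansion step described above can be observed directly. The following is a minimal sketch, assuming a scratch directory created with mktemp and three hypothetical file names (lesson01, lesson02, notes) that are not part of the course material. It shows that echo receives the expanded names, and that quoting the pattern (quoting is covered later in this unit) leaves the metacharacter untouched.

```shell
#!/bin/bash
# Work in a throwaway directory so that no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch lesson01 lesson02 notes

# The shell replaces lesson* with the matching names before echo runs,
# so echo receives two ordinary arguments.
expanded=$(echo lesson*)
echo "$expanded"

# Inside single quotes the shell leaves the * alone,
# so echo receives the literal text lesson*.
literal=$(echo 'lesson*')
echo "$literal"

# Clean up the scratch directory.
cd / && rm -r "$workdir"
```

In both cases echo itself does nothing special; the difference comes entirely from what the shell hands over to it.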


1.3 BASH: Bourne Again Shell

The Bourne Again shell is the standard GNU shell, intuitive and flexible. It is
probably the most advisable shell for beginning users while being at the same
time a powerful tool for the advanced and professional user. On Linux, bash is
the standard shell for common users. This shell is a so-called superset of the
Bourne shell: it adds a set of extensions on top of it. This means that the
Bourne Again shell is compatible with the Bourne shell: commands that work
in sh also work in bash. However, the reverse is not always the case.

To find out which shell you are using, invoke the command echo $SHELL. The
output could show /bin/sh (Bourne shell), /bin/csh (C shell), /bin/ksh (Korn
shell) or /bin/bash (bash shell).
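
As a small illustration of the check above, the sketch below maps the value of SHELL to the shell names listed in the text. The variable names login_shell and kind are chosen for this example only. Note that SHELL normally names the login shell, which is not necessarily the shell that is currently running.

```shell
#!/bin/bash
# SHELL normally holds the path of the user's login shell;
# fall back to a placeholder if it is not set at all.
login_shell=${SHELL:-unknown}

# Match the end of the path against the shells named in the text.
case "$login_shell" in
    */bash) kind="bash shell" ;;
    */ksh)  kind="Korn shell" ;;
    */csh)  kind="C shell" ;;
    */sh)   kind="Bourne shell" ;;
    *)      kind="some other shell" ;;
esac

echo "$login_shell is the $kind"
```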

When BASH is started, it reads its configuration files. The most important are:

/etc/profile - read at login time, for all shells
~/.bash_profile - read by a bash login shell (e.g. for printing system details
on screen)
~/.bashrc - read by a non-login shell window


1.3.1 Advantages of BASH

Bash is an sh-compatible shell that incorporates useful features from the
Korn shell (ksh) and C shell (csh). It is intended to conform to the IEEE
POSIX P1003.2/ISO 9945.2 Shell and Tools standard. It offers functional
improvements over sh for both programming and interactive use; these
include:
o Command line editing
o Unlimited size command history
o Job control
o Shell functions and aliases
o Indexed arrays of unlimited size
o Integer arithmetic in any base from two to sixty-four

Bash can run most Bourne shell scripts without modifications.
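
A few of the features listed above can be tried in a short script. This is only a sketch that requires bash itself (it will not run under a plain Bourne shell); the names colors, binary, hex and greet are made up for the illustration.

```shell
#!/bin/bash
# Indexed array: elements are numbered from 0 and the array can grow.
colors=(red green blue)
colors[3]=yellow
echo "count: ${#colors[@]}, second: ${colors[1]}"

# Integer arithmetic with an explicit base, written as base#digits.
binary=$((2#1010))
hex=$((16#ff))
echo "2#1010 = $binary, 16#ff = $hex"

# A shell function is called like any other command;
# $1 is its first argument.
greet() {
    echo "Hello, $1"
}
greeting=$(greet world)
echo "$greeting"
```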

In our course, we will work with BASH only. The formats and commands
mentioned in this course may need slight changes to work in other
shells.


1.4 Redirection

Many of the UNIX commands that we have come across send their output
to the terminal, and some take their input from the keyboard. One might
therefore think that these commands are designed to work only with fixed
sources and destinations. In fact, they are designed to use character
streams without knowing their source or destination. A character stream is
just a sequence of bytes that many commands use as input and output.

In a UNIX system these streams are treated as files, and a group of UNIX
commands reads from or writes to these files. A command is usually not
designed to send output to the terminal but to this file. Likewise, it is not
designed to accept input from the keyboard either, but only from a standard
file which it sees as a stream. There's a third stream for all error messages
thrown out by a program. This stream is the third file.

It's here that the shell comes in. The shell sets up these three standard files
(for input, output and error) and attaches them to a user's terminal at the time
of logging in. Any program that uses streams will find them open and
available. The shell also closes these files when the user logs out.

The standard file for input is known as standard input and that for output is
known as standard output. The error stream is known as standard error. By
themselves, these standard files are not associated with any physical device,
but the shell has set some physical devices as defaults for them:


Stream             Default source/destination
Standard input     The default source is the keyboard
Standard output    The default destination is the terminal screen
Standard error     The default destination is the terminal screen

1.4.1 Standard Output

Commands like more send their output as a character stream. This stream is
called the standard output stream and appears on the terminal screen by
default. Using redirection, this stream can be sent to a disk file instead.

Example,

bash> more myFile > newFile

The shell looks at the >, understands that standard output has to be
redirected, opens the file newFile, writes the stream into it and then closes
the file. All this happens with more knowing nothing about it, because more
sends its output to the stream and that stream gets redirected to a disk file.

With the > redirection operator, the shell overwrites an existing file, or
creates a new file if none with that name exists. Alternatively, it is possible
to append to an existing file by using another redirection operator, >>.

Operator   Action performed
>          Creates a new file, or overwrites the file if it already exists
>>         Appends to the file if it exists, or creates a new file
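
The difference between the two operators can be checked with a scratch file. The sketch below assumes a temporary directory from mktemp and a hypothetical file name log.txt; neither comes from the course material.

```shell
#!/bin/bash
# Scratch file in a temporary directory (log.txt is a made-up name).
workdir=$(mktemp -d)
log="$workdir/log.txt"

echo "first" > "$log"     # the file does not exist yet: > creates it
echo "second" > "$log"    # the file exists: > overwrites it, "first" is lost
after_overwrite=$(cat "$log")

echo "third" >> "$log"    # >> appends, so "second" is kept
line_count=$(wc -l < "$log")

echo "after overwrite: $after_overwrite"
echo "lines after append: $line_count"

# Clean up the scratch directory.
rm -r "$workdir"
```

After the run the file held two lines, "second" and "third": the first write was replaced and the last one was added to the end.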

It is also possible to group commands together and redirect their combined
output to a file. A pair of parentheses groups the commands, and a single
redirection then applies to the whole group.

Example,

bash> (ls -l; who) > myFile

It is also possible to redirect the results to another program; this is the
concept of pipelining, which we will discuss later on.
Thus the standard output has three possible destinations:
The terminal screen, which is the default destination
A disk file
A pipe to another command


NOTE: The shell creates the file before it redirects the output into it.
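
One visible consequence of this note: when ls is redirected to a file in the directory being listed, the output file already exists by the time ls runs and therefore shows up in its own listing. A minimal sketch, assuming an empty scratch directory and the hypothetical file name listing.txt:

```shell
#!/bin/bash
# Start from an empty scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The directory is empty, yet the listing is not: the shell has
# already created listing.txt when ls starts reading the directory.
ls > listing.txt
found=$(cat listing.txt)
echo "ls reported: $found"

# Clean up the scratch directory.
cd / && rm -r "$workdir"
```

The same ordering explains a common accident: a command such as sort data > data empties the file before sort gets a chance to read it.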

1.4.2 Standard Input

Some commands are designed to take their input as a stream as well. This
stream represents the standard input to the command. A classic example of
the use of standard input is the wc command, which counts lines, words and
characters:

bash> wc
2 * 4
23 ^ 64
[ctrl-d]
2 10 44

With no filename provided, wc reads from the keyboard until [ctrl-d] is
pressed. It then reports the number of lines, the number of words and the
number of characters, and sends them to the standard output. Note that
there is no filename in the output.

bash> wc < my
5 9 54

When a file is redirected to the command with <, the command takes the
disk file as its input stream.

Thus the standard input has three possible sources:
The keyboard, which is the default standard input
A pipe, which supplies the output of some other command
A file, which supplies input from disk

NOTE: When a file is redirected to a command, it is the shell that opens
the file, and the command does not know where its input comes from. But
when the command is used with the filename as one of its arguments, the
command itself opens the file.

1.4.3 Standard Error

When you enter an incorrect command or try to open a nonexistent file,
certain diagnostic messages show up on the screen. This is the standard
error stream. Like standard output, it too is destined for the terminal. Note
that they are in fact two separate streams, and the shell possesses a
mechanism for capturing them individually.

Before we proceed any further, you should know that each of these three
standard files has a number, called a file descriptor, which is used for
identification:
0   Standard input    < is the same as 0<
1   Standard output   > is the same as 1>
2   Standard error    Must be 2> only

These descriptors are implicitly prefixed to the redirection symbols. For
instance, > and 1> mean the same thing to the shell, while < and 0< also are
identical. You normally don't need to use the numbers 0 and 1 to prefix the
redirection symbols because they are the default values. However, we need
to use the descriptor 2> for the standard error:

bash> cat bar > errorfile
cat: cannot open bar: No such file or directory
bash> cat errorfile

Without the file descriptor in front of the redirection symbol, the error
message still appears on the terminal and we don't get it in the file.

bash> cat bar 2> errorfile
bash> cat errorfile
cat: cannot open bar: No such file or directory

This works. You can also append diagnostic output in a manner similar to the
one in which you append standard output:

bash> cat bar 2>> errorfile

You can now save error messages in a separate file. This enables you to run
long programs and save error output to be viewed at the end of the day.

1.4.4 Combining Streams

In UNIX, it is also possible to use both input and output streams at the same
time, and the shell in this case keeps the command ignorant of the source
and destination.

bash> cat > my

Here cat reads its standard input (the keyboard, by default) and its
standard output is redirected to the file my.
It is also possible to combine the < and > operators, and the sequence of
their use is immaterial to the shell.

bash> wc < infile > newfile
bash> wc > newfile < infile
bash> > newfile < infile wc

All three commands are different ways of writing the same task. It is also
possible to redirect the standard output and standard error in the same
command line:

bash> cat newfile nofile 2> errorfile > outfile

By default, errors are written to standard error (stderr) and normal
output is sent to standard output (stdout). For example, if you type the
following command to compile some C programs, only the normal output
is sent to the file; the errors still show up on the terminal.

bash> cc x.c y.c > compile.out
variable x is not defined.
variable y is redefined.
variable z is not defined.

But if you want both the errors and the usual output (e.g. any warnings, etc.)
to go into a single file, then you can use the following command:

bash> cc x.c y.c > compile.out 2>&1
# Note that no output is printed on the screen
bash> cat compile.out
variable x is not defined.
Warning: variable type mismatch.
variable y is redefined.
variable z is not defined.

2.3 Pipeline

In UNIX, it is often desirable to feed the output of one command to another
command in order to accomplish a task. For instance, the following set of
commands performs such a task:

bash> who > user.lst
bash> cat user.lst
araz tty01 May 18 09:32
amol tty02 May 18 11:18
achint tty03 May 18 13:21

Now, to count the number of users, we can redirect the file user.lst so that
it becomes the standard input of wc.

bash> wc -l < user.lst
3

This method of using multiple commands to accomplish a task has some
obvious disadvantages:

1. The process is slow. The later command cannot be executed until the
earlier ones have finished.
2. An intermediate file is required, and it has to be removed after the wc
command has been executed.
3. When handling large files, temporary files can build up easily and eat up
the disk space.

The shell has a powerful ability to connect these commands directly,
without needing any intermediate files, so that each command takes its
input from the previous one. This is accomplished using the pipe (|) operator.

By using a pipe, the command sequence shown above can be compressed
into the following single command:

bash> who | wc -l
3

Here, who is said to be piped to wc. No intermediate files are created when
they are used. When a sequence of commands is combined together in this
way, a pipeline is said to be formed. The name is appropriate, as the
connection it establishes between programs resembles a plumbing joint. It's
the shell that sets up this interconnection, and the commands have no
knowledge of it.

The pipe serves as the destination of standard output for the command on
its left and as the source of standard input for the command on its right. You
can now use one to count the number of files in the current directory:

bash> ls | wc -l
15

Note that no separate command was designed to tell you that, though the
designers could easily have provided another option to ls to perform this
operation. And because wc writes to standard output, you can redirect this
output to a file:

bash> ls | wc -l > fkount

There's no restriction on the number of commands you can use in a pipeline.
But you must know the behavioral properties of these commands to place
them there. Consider this generalized command line:

command1 | command2 | command3 | command4

It should be pretty obvious that command2 and command3 must support both
standard input and standard output. command1 only needs to write to
standard output, while command4 must be able to read from standard input. If
you can ensure that, then you can have a chain of these tools connected
together.

Commands such as command2 and command3, which support both streams,
are called filters. These will be discussed later.
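
The four-stage shape described above can be sketched with standard filters. The example below is an illustration under stated assumptions: it feeds the pipeline a fixed, who-like list instead of calling who (the user names are borrowed from the earlier listing, with one repeated on purpose), and it uses the filters cut, sort and uniq, which have not yet been introduced in this course. It counts distinct users rather than login sessions.

```shell
#!/bin/bash
# Made-up session list in the style of who output: one line per login.
sessions='araz tty01
amol tty02
araz tty03'

# command1: printf writes the list to standard output only.
# command2..3: cut, sort and uniq are filters; each reads standard
#              input and writes standard output.
# command4: wc -l reads standard input and reports the line count.
user_count=$(printf '%s\n' "$sessions" | cut -d' ' -f1 | sort | uniq | wc -l)

echo "distinct users: $user_count"
```

No intermediate file is created at any stage; each filter starts working on data as soon as the previous stage produces it.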

1.5 Variables

The shell lets you define shell variables that store a value. The value can
later be referenced on the command line or in shell scripts (we will learn
about shell scripts shortly). Shell variables are of string type, which means
the value is stored in ASCII rather than in binary format. No type declaration
is necessary before you can use a shell variable. Shell variables are set
using the generalized form variable=value, and are referenced by placing a $
as a prefix to the name. The unset command removes a variable.

Example,

bash> a=4
bash> echo $a
4
bash> unset a
bash> echo $a

bash>

NOTE: There should be no space between the variable name, = and the
variable value; otherwise the shell will interpret the variable name as a
command and the = and the variable value as its arguments.

A variable that has not been set evaluates to a null string, but sometimes it is
desirable to set a variable explicitly to a null value by using any one of the
following constructs:
x='' or x="" or x=

It is also possible to assign a multiple-word string to a shell variable. There
are two possible approaches:
1. Escape the blank spaces using the escape character \
2. Use quotes.

bash> a="My name is Amrit"
bash> echo $a
My name is Amrit

1.5.1 Setting strings with the variable names having $

A string can contain the $ character for one of two reasons:

1. The string inherently contains the $ sign. Example:
My salary per month is $1000

bash> echo 'My salary per month is $1000'
My salary per month is $1000

With single quotes, $1000 is echoed as it is.

bash> echo "My salary per month is $1000"
My salary per month is 000

With double quotes, $1 is taken to be a shell variable. The shell tries to
access its value, which is undefined, and so replaces $1 with a null string.

Thus, the shell handles a string differently depending on whether it is
enclosed in single quotes or double quotes.

2. The string uses a variable name with the $ character so that the variable is
replaced with its value.

Example,
"My salary per month is \$$x"
The variable x is replaced with the salary amount, preceded by a literal
dollar sign.

1.5.2 Types of variables

As a convention, variables are used with uppercase names. Bash keeps a list
of two types of variables:

Global variables

Global variables or environment variables are available in all shells. The env
or printenv commands can be used to display environment variables.

Local variables

Local variables are only available in the current shell. Using the set builtin
command without any options will display a list of all variables (including
environment variables) and functions. The output will be sorted according to
the current locale and displayed in a reusable format.

A local variable is not automatically available to a subshell unless it is
exported.


1.5.3 Exporting variables

A variable created like the ones in the examples above is only available to the
current shell. It is a local variable. Child processes of the current shell will not
be aware of this variable. In order to pass variables to a subshell, we need to
export them using the export builtin command. Variables that are exported

are referred to as environment variables. Setting and exporting is usually
done in one step:

export VARNAME="value"

A subshell can change variables it inherited from the parent, but the changes
made by the child don't affect the parent. This is demonstrated in the
example:

bash> full_name="Amrit Swarup"
bash> bash
bash> echo $full_name

bash> exit
bash> export full_name
bash> bash
bash> echo $full_name
Amrit Swarup
bash> export full_name="Charan Singh"
bash> echo $full_name
Charan Singh
bash> exit
bash> echo $full_name
Amrit Swarup


1.5.4 Using Shell Variables

In UNIX, a variable can hold a pathname, a command, or the output of a
command obtained through command substitution. The usage examples
below set variables to such values and then use the variables in place of the
values themselves.

Setting the path name

bash> x=/home/ganesh/father
bash> cd $x
bash> pwd
/home/ganesh/father

Thus, a pathname can be stored in a variable, and the cd command can then
use the variable to reach that pathname again and again.
NOTE: This is a useful practice in day-to-day work. Long absolute
pathnames can be stored in variables and used again and again without the
trouble of memorizing them or typing them out in full.
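
The quoting and exporting rules from sections 1.5.1 to 1.5.3 can be checked in one short script. This is a sketch under stated assumptions: the variable salary and the helper variables single, double, before and after are invented for the example, sh -c is used to start a child shell non-interactively, and the ${name:-text} form (which substitutes text when name is unset) is not covered in this lesson.

```shell
#!/bin/bash
salary=1000

# Single quotes: the $ is kept as an ordinary character.
single='My salary per month is $salary'
# Double quotes: \$ gives a literal dollar sign, $salary is replaced.
double="My salary per month is \$$salary"
echo "$single"
echo "$double"

# A local variable is invisible to a child shell...
unset full_name
full_name="Amrit Swarup"
before=$(sh -c 'echo "${full_name:-not set}"')

# ...until it is exported; the child then receives a copy.
export full_name
after=$(sh -c 'echo "$full_name"')

# The child changes only its own copy; the parent keeps its value.
sh -c 'full_name="Charan Singh"'

echo "before export: $before"
echo "after export: $after"
echo "parent still has: $full_name"
```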

1.6 Command Substitution

UNIX systems offer more than one way to connect two commands. The
standard output of one command can be connected to the standard input of
another command using pipelines or redirection.
The shell also allows the argument of a command to be obtained from
another command; this feature is called command substitution. It is useful
whenever a command argument has to be the output of another
command. For example, we may need to print a string that tells us the
number of files in the directory:

There are 24 files in the directory.

So, how will you achieve this? The shell has this feature.

bash> echo "There are `ls | wc -l` files in the directory."
There are 24 files in the directory.

Here the command has been substituted into the string, which then acts as
an argument to the other command (echo), by placing the command
between two ` characters (backquotes or backticks). The backquote is a
metacharacter that the shell looks for (we cover metacharacters ahead).
When a command is enclosed between backquotes, the shell first executes
the command and then replaces the enclosed command text with the output
of the command.

So far, metacharacters may have seemed to behave in the same manner
inside double quotes and single quotes. Let's try this one with single quotes:

bash> echo 'There are `ls | wc -l` files in the directory.'
There are `ls | wc -l` files in the directory.

So, backquotes are not interpreted by the shell if they are placed between
single quotes.


1.7 Pattern Matching - The Wild Cards

While working with the UNIX system we often land in situations where the
same operation has to be applied collectively to a larger group of files. A
typical case is listing the files whose names start with lesson:

ls -l lesson01 lesson02 lesson03

This can also be represented as:

ls -l lesson*

Characters such as * are called metacharacters. They are special characters
that the shell understands and expands according to the character and its
intended use. Let's now discuss the metacharacters and their attributes in
some detail.

1.7.1 The * & ?

The *, known as a metacharacter, is one of the characters of the shell's
special set. This character matches any number of characters (including
none). When the * is appended to the string lesson, the pattern lesson*
matches filenames beginning with the string lesson, including the file lesson.
It thus matches all the files specified in the previous command line. You can
now use this pattern as an argument to ls:

bash> ls -x lesson*
lesson lesson01 lesson02 lesson03 lesson04 lesson05
lessonA lesson.pl lesson.c lesson.cpp

When the shell encounters this command line, it immediately identifies the *
as a metacharacter. It then creates a list of files from the current directory that
match this pattern. It reconstructs the command line as below:

bash> ls -x lesson lesson01 lesson02 lesson03 lesson04
lesson05 lessonA lesson.pl lesson.c lesson.cpp

NOTE: Windows users may be surprised to know that the * may occur
anywhere in a filename, and not merely at the end. Thus, *lesson* matches all
the following filenames: lesson newlesson lesson03 lesson03.txt.

The next metacharacter is the ?, which matches a single character. When
used with the same string lesson (as lesson?), the shell matches all
seven-character filenames beginning with lesson. Place another ? at the end
of this string, and you have the pattern lesson??. Use both these expressions
separately, and the meaning of the ? will be obvious:

bash> ls -x lesson?
lessonx lessony lessonz
bash> ls -x lesson??
lesson01 lesson02 lesson03 lesson04 lesson15 lesson16
lesson17

These metacharacters are also called wild cards (to depict something like a
joker that can match any card). In the upcoming sessions we will take a look
at other wild cards.


1.8 The Character Class


The patterns framed in the previous examples are not very restrictive or
specific. If we want to list only lessonA and lessonZ from among all the
lesson files, we cannot do that using the patterns we have studied so far. To
do this we need a character class for specific matching.

The character class uses two more metacharacters, represented by a pair of
brackets [ ]. You can have multiple characters inside this enclosure, but
matching takes place for a single character in the class. For example, a single
character expression that can take one of the values 1, 2 or 4 can be
represented by the expression:
[124] Either 1, 2 or 4
This can be combined with any string or another wild-card expression, so
selecting the files lesson01, lesson02, lesson03, lesson04 becomes a simple
matter:

bash> ls -x lesson0[1234]
lesson01 lesson02 lesson03 lesson04


1.9 Matching a dot (.)

In UNIX file systems, there are lots of files whose names start with a dot (.). It
is sometimes desirable to do some collective wild card operations on these
files. An example can be,

bash> ls -x *
lesson01 lesson02 .

This will not show the files starting with dots. To match a dot at the start of a
filename, the dot has to be written literally in the pattern.

bash> ls -x .*
.exrc .encrc .profile

But the * does match any number of dots that occur in the middle of a
filename.

bash> ls -x my*c
my_file.c my.c my.stored.c

NOTE: Using * with rm

Let's discuss a potential issue which each UNIX user faces at least once in
his life: the careless use of a very powerful command

bash> rm *

To remove all the files starting with lesson we can use the command

bash> rm lesson*

But with a bit of carelessness you can type

bash> rm lesson *

Because of the space, the shell passes lesson and * to rm as two separate
arguments, so rm removes the file lesson and then every other file in the
directory. You have messed up everything beyond repair. Now be ready for a
scolding from the system administrator. So be careful while using this
command.
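
The wild cards from sections 1.7 to 1.9 can be tried safely in a scratch directory. The sketch below makes several assumptions: the file names are invented for the test, set -- is used only as a way to count how many names a pattern expands to (it stores the expanded names as the script's arguments, and $# is their count), and echo stands in for rm as a preview. Putting echo in front of a risky command shows the command line the shell would build without running it.

```shell
#!/bin/bash
# Scratch directory with a known set of files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch lesson lesson01 lesson02 lessonA lesson.c notes .profile

set -- lesson*;     star_count=$#      # lesson and everything after it
set -- lesson??;    two_count=$#       # lesson01, lesson02 and lesson.c
set -- lesson0[12]; class_count=$#     # lesson01 and lesson02 only
set -- *;           visible_count=$#   # .profile is not matched by *
set -- .p*;         hidden_name=$1     # the leading dot is given literally

echo "lesson*: $star_count  lesson??: $two_count  lesson0[12]: $class_count"
echo "*: $visible_count  .p*: $hidden_name"

# Preview of the careless command: echo prints what rm would receive.
echo rm lesson *

# Clean up the scratch directory.
cd / && rm -r "$workdir"
```

The preview line lists lesson followed by every visible file in the directory, which makes the effect of the stray space obvious before anything is deleted.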


1.10 Summing up

The shell is a core component of the UNIX operating system. It interprets the
user's commands and provides powerful features like redirection, pipes,
metacharacters, etc.

Bash is a shell that is compatible with the Bourne shell and incorporates
many useful features from other shells. Bash's biggest features are its
powerful history support and command line editing. In our course, we use the
BASH shell to explain the examples. In other shells the implementation is
slightly different.
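
As a recap of the redirection rules summarized here, the sketch below separates and then merges the two output streams. It is an illustration under stated assumptions: report is a made-up function that writes one line to each stream, >&2 (send this output to standard error) is not covered in the lesson text, and the file names are hypothetical.

```shell
#!/bin/bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A stand-in for any program: one normal line, one diagnostic line.
report() {
    echo "normal output"
    echo "diagnostic message" >&2
}

# Each stream goes to its own file.
report > out.txt 2> err.txt
out_text=$(cat out.txt)
err_text=$(cat err.txt)

# Both streams go to the same file: redirect stdout first, then 2>&1.
report > both.txt 2>&1
both_lines=$(wc -l < both.txt)

# Order matters: here 2>&1 is processed first, so standard error follows
# the original standard output and only the diagnostic is captured.
captured=$(report 2>&1 > /dev/null)

echo "out.txt: $out_text"
echo "err.txt: $err_text"
echo "lines in both.txt: $both_lines"
echo "captured: $captured"

# Clean up the scratch directory.
cd / && rm -r "$workdir"
```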



Self-check Questions

1. While a command is being executed the shell prompts the user for another
command and puts that command in its priority queue. (True/False)
2. Shell is in __________________ (execution/sleep) mode while there is no
command keyed in on the terminal and another command is running.
3. The redirection symbol > appends the redirected text to a file. (True/False)
4. Get the odd one out: The possible sources of standard input are:
a. Pipe
b. Keyboard
c. Printer
d. file





1.11 Answers to the Self-Check Questions

1. False
2. Sleep
3. False
4. (c)


1.12 Terminal Questions

1. What is exporting a variable and why is it used?
2. Explain what a metacharacter is. Why do you need it?
3. Explain the difference between pipes and redirection.



COE Unit 2, Lesson 2



LESSON 2 SHELL SCRIPTING AND DEBUGGING

2. SHELL SCRIPTING AND DEBUGGING..................................................................... 85
2.0 OBJECTIVES ............................................................................................................ 85
2.1 INTRODUCTION ........................................................................................................ 85
2.2 CREATING AND RUNNING A SCRIPT ......................................................................... 85
2.2.1 myScript.sh........................................................................................................ 85
2.2.2 Writing and naming .......................................................................................... 86
2.2.3 Executing the Script ......................................................................................... 86
2.3 SCRIPT BASICS ....................................................................................................... 88
2.3.1 Which shell will Run the Script? ..................................................................... 88
2.3.2 Adding comments............................................................................................. 88
2.4 DEBUGGING BASH SCRIPTS ................................................................................... 89
2.4.1 Debugging On the Entire Script ..................................................................... 89
2.4.2 Debugging On Part(s) Of the Script .............................................................. 90
2.5 QUOTING ................................................................................................................. 93
2.5.1 Escape Character............................................................................................. 93
2.5.2 Single Quotes ................................................................................................... 94
2.5.3 Double-Quotes.................................................................................................. 94
2.6 SPECIAL VARIABLES................................................................................................ 95
2.7 SUMMING UP ........................................................................................................... 98
2.8 ANSWERS TO THE SELF-CHECK QUESTIONS.......................................................... 98
2.9 TERMINAL QUESTIONS ............................................................................................ 98



2. Shell Scripting and Debugging




To write effective scripts, it is important to know the structure of a script and to be
able to debug it when required. These concepts form the base for subsequent
chapters.



2.0 Objectives

After going through this lesson, you will be able to:

Write a simple script
Define the shell type that should execute the script
Put comments in a script
Change permissions on a script
Execute and debug a script


2.1 Introduction

This chapter enables the student to begin writing scripts of low
complexity. Debugging is also needed at times, and the methodology
described in this chapter enables the student to debug scripts effectively.


2.2 Creating and running a script

2.2.1 myScript.sh

In this example we use the echo Bash built-in to inform the user about what is
going to happen, before the task that will create the output is executed.
The script welcomes the user, gives the current date and time, lists the
directory contents, searches for the text Blue in all files whose names start
with demo and stores the result in the file searchResult.txt. For the scripts in
this chapter we are assuming they are created in the following directory:
~/scripts

myScript.sh
#!/bin/bash
echo ""
echo "This is my first shell script."
USERNAME=`whoami`
echo "Welcome $USERNAME"
echo ""
CURRENT_TIME=`date +%T`
CURRENT_DATE=`date +%D`
echo "Date: $CURRENT_DATE Time: $CURRENT_TIME"
echo ""
echo ""
echo "Here are the files in your current directory."
echo ""
ls
grep Blue demo* > searchResult.txt

2.2.2 Writing and naming

To create a shell script:

Open a new empty file in your editor (vi, vim, gvim, emacs, gedit, dtpad etc.).

Put UNIX commands in the new empty file, like you would enter them on the
command line. As discussed in the previous chapter, commands can be shell
functions, shell built-ins, UNIX commands and other scripts.

Give your script a sensible name that gives a hint about what the script does.
Make sure that your script name does not conflict with existing commands. In
order to ensure that no confusion can arise, script names often end in .sh;
even so, there might be other scripts on your system with the same name as
the one you chose.

Check using which, whereis and other commands for finding information
about programs and files:
which -a script_name
whereis script_name
locate script_name

2.2.3 Executing the Script

The script can run like any other command:
COE Unit 2, Lesson 2



87



























The above mentioned scheme is the most common way to execute a script. It
is preferred to execute the script like this in a sub shell. The variables,
functions and aliases created in this sub shell are only known to the particular
bash session of that sub shell. When that shell exits and the parent shell
regains control, everythi ng is cleaned up.

Remember to add the directory to the contents of the PATH variable.
It is essentially a colon separated list of directories. When you execute a
command, the shell searches through each of these directories, one by one,
until it finds a directory where the executable exists.

export PATH="$PATH:~/scripts"

If you did not put the scripts directory in your PATH, and the current directory
is not in the PATH either, you need to specify the path of the script and
activate it. If it is i n the current directory acti vate the script like this:
./script_name.sh

A script can also explicitly be executed by a gi ven shell, but generally we only
do this if we want to obtai n special behavior, such as checki ng if the script
works with another shell or pri nti ng traces for debugging:

rbash script_name.sh
bash> chmod u+x myScript.sh

bash> ls l myScript.sh
rwxrwr 1 salil salil 456 Dec 24 17:11
myScript.sh

bash> myScript.sh

Check that you really
obtained the permissions
that you want

This is my first shell script.
Welcome salil
Date: 12/21/07 Time: 12:26:40
Here are the files i n your current directory.
demo.txt
demo2.txt
demo3.txt
lab
myScript.sh
newfile.txt
output.txt
update.ppt

The script should have execute
permissions for the correct owners
in order to be runnable.
COE Unit 2, Lesson 2



88
sh script_name.sh
bash x script_name.sh

The specified shell will start as a sub shell of your current shell and executes
the script. This is done when you want the script to start up with specific
options or under specific conditions which are not specified in the script.

If you don' t want to start a new shell but execute the script in the current shell,
you source it:

source script_name.sh

The script does not need execute permission in this case. Commands are
executed i n the current shell context, so any changes made to your
environment will be available when the script finishes execution
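
The difference between running a script in a sub shell and sourcing it can be seen in a minimal sketch. The helper script and the GREETING variable below are made up for this demonstration and are not part of the course scripts:

```shell
#!/bin/bash
# Sketch: compare running a script in a sub shell with sourcing it.
# The helper script and the GREETING variable are hypothetical.

helper=$(mktemp)
echo 'GREETING="set by helper"' > "$helper"

unset GREETING

bash "$helper"                          # executed by a new bash process
after_subshell="${GREETING:-<unset>}"   # the assignment did not reach us

source "$helper"                        # executed in the current shell
after_source="${GREETING:-<unset>}"     # the assignment is now visible

echo "After running in a sub shell: $after_subshell"
echo "After sourcing:               $after_source"

rm -f "$helper"
```

The variable assigned by the helper disappears together with the sub shell, but it stays set after the helper has been sourced.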


2.3 Script Basics

2.3.1 Which shell will Run the Script?

When running a script in a subshell, you should define which shell should run
the script. Consider, for example, that your login shell may be the C shell
while your script contains bash commands. The shell type in which you
wrote the script might not be the default on your system, so commands you
entered might result in errors when executed by the wrong shell.

The first line of the script determines the shell in which the script will run. The
first two characters of the first line should be #!, followed by the path to the
shell that should interpret the commands that follow. Blank lines are also
considered to be lines, so don't start your script with an empty line.

For the purpose of this course, all scripts will start with the line
#!/bin/bash

2.3.2 Adding comments

It is good practice to add comments to your scripts. Comments help in the
future, when you need to enhance or fix the script, and they also make
the script more readable.

Usually, the first few lines of a script should state the purpose of the
script. You should then put comments in the code as well.


2.4 Debugging Bash Scripts

2.4.1 Debugging On the Entire Script

Bash provides extensive debugging features. The most common is to start up
the sub shell with the -x option, which runs the entire script in debug mode.
Traces of each command plus its arguments are printed to standard error
after the commands have been expanded but before they are executed.

The commented_script1.sh script is shown below. The first line of the script
determines the shell to start, Bash in this case. Apart from that first line,
everything the shell encounters after a hash mark on a line is a comment and
is ignored.

commented_script1.sh
#!/bin/bash
# This script clears the terminal, displays a greeting and gives information
# about currently connected users. The current directory contents are
# displayed too.

clear                    # clear terminal window

echo "The script starts now."

echo "Hi, $USER!"        # dollar sign is used to get content of variable
echo

echo "List of connected users:"
echo
w                        # show who is logged on
echo

echo "Displaying the contents of this directory"
ls                       # list the contents of this directory

Following is the commented_script1.sh script run in debug mode. Note again
that the added comments are not visible in the output of the script.

bash> bash -x commented_script1.sh
+ clear

+ echo 'The script starts now.'
The script starts now.
+ echo 'Hi, salil!'
Hi, salil!
+ echo

+ echo 'List of connected users:'
List of connected users:
+ echo

+ w
4:50pm up 18 days, 6:49, 4 users, load average: 0.58, 0.62, 0.40
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
root tty2 Sat 2pm 5:36m 0.24s 0.05s bash
salil :0 Sat 2pm ? 0.00s ?
salil pts/2 Sat 2pm 43:13 0.13s 0.06s /usr/bin/screen

+ echo

+ echo 'Displaying the contents of this directory'
Displaying the contents of this directory
+ ls
demo1.txt demo2.txt myScript.sh

2.4.2 Debugging On Part(s) Of the Script

Using the set Bash built-in you can run in normal mode those portions of the
script that you are sure are without fault, and display debugging
information only for troublesome zones.

Say we are not sure what the w command will do in the example
commented_script1.sh. We could then enclose it in the script like this:

set -x    # activate debugging from here

w

set +x    # stop debugging from here

Output then looks like this:

bash> script1.sh
The script starts now.
Hi, salil!

List of connected users:

+ w
5:00pm up 18 days, 7:00, 4 users, load average: 0.79, 0.39, 0.33
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
root tty2 Sat 2pm 5:47m 0.24s 0.05s bash
salil :0 Sat 2pm ? 0.00s ?
salil pts/2 Sat 2pm 54:02 0.13s 0.06s /usr/bin/screen
+ set +x

Displaying the contents of this directory
demo1.txt demo2.txt myScript.sh

bash>

The table below gives an overview of other useful Bash options:

Table: Overview of set debugging options

Short notation   Long notation    Result
set -f           set -o noglob    Disable file name generation using
                                  metacharacters (globbing).
set -v           set -o verbose   Print shell input lines as they are read.
set -x           set -o xtrace    Print command traces before executing
                                  the command.

The dash is used to activate a shell option and a plus to deactivate it.
In the example below, we demonstrate these options on the command line:

bash> set -v

bash> ls
ls
commentedscripts.sh script1.sh

bash> set +v
set +v

bash> ls *
commentedscripts.sh script1.sh

bash> set -f

bash> ls *
ls: *: No such file or directory

bash> touch *

bash> ls
* commentedscripts.sh script1.sh

bash> rm *
bash> ls

commentedscripts.sh script1.sh

Alternatively, these modes can be specified in the script itself, by adding the
desired options to the first line shell declaration. Options can be combined, as
is usually the case with UNIX commands:

#!/bin/bash -xv

Once you have found the buggy part of your script, you can add echo statements
before each command of which you are unsure, so that you will see exactly
where and why things don't work. In the example commented_script1.sh
script, it could be done like this, still assuming that the displaying of users
gives us problems:

echo "debug message: now attempting to start w command"; w

In more advanced scripts, the echo can be inserted to display the content of
variables at different stages in the script, so that flaws can be detected:

echo "Variable VARNAME is now set to $VARNAME."
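
As an illustration of partial debugging, the sketch below traces a single assignment and collects the trace in a file. The temporary log file and the variable names are assumptions made for this demo only:

```shell
#!/bin/bash
# Sketch: trace only one part of a script and keep the trace in a file.
# trace_log and total are hypothetical names chosen for this demo.

trace_log=$(mktemp)
total=0

echo "Tracing is off here."

{
    set -x                      # activate debugging from here
    total=$(( total + 5 ))
    set +x                      # stop debugging from here
} 2> "$trace_log"               # traces are written to standard error

trace=$(cat "$trace_log")
rm -f "$trace_log"

echo "total is $total"
echo "Captured trace:"
echo "$trace"
```

Because the traces go to standard error, redirecting only that stream keeps the normal output of the script clean while the troublesome zone is inspected.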


2.5 Quoting

Quoting is used to remove the special meaning of certain characters or words
to the shell. Quoting can be used to disable special treatment for special
characters (to preserve their literal meaning), to prevent reserved words from
being recognized as such, and to prevent parameter expansion. The
application should quote the following characters if they are to represent
themselves:
| & ; < > ( ) $ ` \ " ' <space> <tab> <newline>

There are three quoting mechanisms:

1. The escape character
2. Single quotes
3. Double quotes

2.5.1 Escape Character

A non-quoted backslash \ is the Bash escape character. It preserves the
literal value of the next character that follows, with the exception of newline. If
a \newline pair appears, and the backslash itself is not quoted, the \newline is
treated as a line continuation (that is, it is removed from the input stream and
effectively ignored).

In the example below, the variable date is created and set to hold a value.
The first echo displays the value of the variable, but for the second, the
dollar sign is escaped.

bash> date=26122007

bash> echo $date
26122007

bash> echo \$date
$date

The following script shows the effect of backslash on newline:

escape.sh
#!/bin/bash

echo "Statement 1: This will print
as two lines."

echo "Statement 2: This will print \
as one line."

On running this script:

bash> escape.sh
Statement 1: This will print
as two lines.
Statement 2: This will print as one line.

2.5.2 Single Quotes

Enclosing characters in single quotes (' ') preserves the literal value of each
character within the quotes. A single quote may not occur between single
quotes, even when preceded by a backslash.
Example:

bash> echo '$date'
$date

2.5.3 Double-Quotes

Enclosing characters in double quotes (" ") preserves the literal value of
all characters within the double quotes, with the exception of the dollar
sign $, the backquote ` and the backslash \.
The characters $ and ` retain their special meaning within double quotes.
The backslash retains its special meaning only when followed by one of the
following characters: $, `, ", \, or newline.

bash> echo "$date"
26122007

bash> echo "`date`"
Sun Apr 20 11:22:06 CEST 2003

bash> echo "I'd say: \"Go for it!\""
I'd say: "Go for it!"

bash> echo "In DOS directories are separated by \\ character"
In DOS directories are separated by \ character
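
The three quoting mechanisms can be compared side by side. This is a minimal sketch; the variable names are chosen only for illustration:

```shell
#!/bin/bash
# Sketch: the same text under the three quoting mechanisms.
colour="Blue"

escaped=\$colour        # backslash: the dollar sign is taken literally
single='$colour'        # single quotes: every character is literal
double="$colour"        # double quotes: $ keeps its special meaning

echo "escaped: $escaped"
echo "single:  $single"
echo "double:  $double"
```

Only the double-quoted form expands the variable; the other two keep the text $colour unchanged.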



2.6 Special Variables

There are some variables which are set internally by the shell and which are
available to the user. The following table lists some of them:

Variable   Definition
$0         Expands to the name of the shell script or command currently
           being executed, or the name of the shell.
$1         Positional parameter #1. Similarly for $2, $3 ... $9. For the
           tenth parameter use ${10}.
$*         Expands to the positional parameters, starting from one ($1).
           When the expansion occurs within double quotes, it expands to
           a single word with the value of each parameter separated by
           the first character of the IFS special variable (refer to the
           note below).
$@         Expands to the positional parameters, starting from one ($1).
           When the expansion occurs within double quotes, each parameter
           expands to a separate word.
$#         Expands to the total number of positional parameters in
           decimal.
$?         Expands to the exit status of the last command executed, given
           as a decimal string.
$-         Expands to the current option flags, as given on invocation or
           set with the set built-in.
$$         Expands to the process ID of the shell.
$!         Expands to the process ID of the most recently executed
           background command.

Note: $IFS, the internal field separator, is a variable which determines how
Bash recognizes fields, or word boundaries, when it interprets character
strings. $IFS defaults to whitespace (space, tab and newline).

A positional parameter is a variable within a shell script whose value is set
from an argument specified on the command line that invokes the script.
Positional parameters are numbered and are referred to with a preceding "$":
$1, $2, $3, and so on. The first nine can be referenced directly; from the
tenth onwards the number must be enclosed in braces, as in ${10}. If a shell
program is invoked with a command line that appears like this:
my_script.sh pp1 pp2 pp3 pp4 pp5 pp6 pp7 pp8 pp9

then positional parameter $1 within the script is assigned the value pp1,
positional parameter $2 is assigned the value pp2, and so on, at the time the
shell script is invoked.

positional.sh
#!/bin/bash

# positional.sh
# This script reads 3 positional parameters and prints them out.

PAR1="$1"
PAR2="$2"
PAR3="$3"

echo "$1 is the first positional parameter, \$1."
echo "$2 is the second positional parameter, \$2."
echo "$3 is the third positional parameter, \$3."
echo
echo "The total number of positional parameters is $#."

Upon execution one could give any number of arguments:

bash> positional.sh one two three four five
one is the first positional parameter, $1.
two is the second positional parameter, $2.
three is the third positional parameter, $3.

The total number of positional parameters is 5.

bash> positional.sh one two
one is the first positional parameter, $1.
two is the second positional parameter, $2.
 is the third positional parameter, $3.

The total number of positional parameters is 2.

In the second run, $3 is empty.

When a UNIX command runs, it can return a numeric exit status value to the
process that called (started) it. The status can tell the calling process whether
the command succeeded or failed. Many (but not all) UNIX commands return
a status of zero if everything was okay or non-zero (1, 2, etc.) if something
went wrong. A few commands, like grep and diff, return a different non-zero
status for different kinds of problems. See your online manual pages to find
out.
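
The different exit statuses of grep can be observed with a small sketch. The sample file and its contents are made up for this demonstration; the status for a missing file is assumed to be greater than 1, as described in the grep manual page:

```shell
#!/bin/bash
# Sketch: grep reports "found", "not found" and "error" with different statuses.
# The sample file is hypothetical and created only for this demo.

sample=$(mktemp)
echo "My favourite colour is Blue." > "$sample"

grep -q Blue "$sample";  found=$?                         # pattern is present
grep -q Green "$sample"; not_found=$?                     # pattern is absent
grep -q Blue "$sample.missing" 2>/dev/null; failed=$?     # file does not exist

echo "match:    $found"
echo "no match: $not_found"
echo "error:    $failed"

rm -f "$sample"
```

The value of $? must be saved immediately after each command, because the next command overwrites it.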


More examples:

bash> grep dictionary /usr/share/dict/words
dictionary

bash> echo $$
10662

bash> mozilla &
[1] 11064

bash> echo $!
11064

bash> echo $0
bash

bash> echo $?
0

bash> ls abc
ls: abc: No such file or directory

bash> echo $?
1

In this session, user rahul starts by entering the grep command. The process
ID of his shell is 10662. After putting a job in the background, $! holds the
process ID of the backgrounded job. The shell running is bash. When a mistake
is made, $? holds an exit status different from 0 (zero); otherwise the
status is 0.

The following script shows the use of the $* special variable:

spl_var_eg.sh
#!/bin/bash
echo "My Process ID is: $$"
echo "The number of Arguments is $#"
echo "The Arguments are $*"
grep "$1" "$2"
echo -e "\nJob Over"

Upon execution:

bash> spl_var_eg.sh Blue demo1.txt
My Process ID is: 23465
The number of Arguments is 2
The Arguments are Blue demo1.txt
My favourite colour is Blue.

Job Over
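
The difference between "$*" and "$@" described in the table of special variables only shows up inside double quotes. The following sketch uses a hypothetical helper function, count_args, and made-up parameter values:

```shell
#!/bin/bash
# Sketch: "$*" expands to one word, "$@" to one word per parameter.
# count_args is a hypothetical helper that prints how many arguments it got.

count_args() {
    echo $#
}

set -- "red apple" "green pear"     # set two positional parameters

star_count=$(count_args "$*")       # a single word
at_count=$(count_args "$@")         # two separate words

old_ifs=$IFS
IFS=,
joined="$*"                         # joined with the first character of IFS
IFS=$old_ifs

echo "\"\$*\" gives $star_count argument(s)"
echo "\"\$@\" gives $at_count argument(s)"
echo "joined with a comma: $joined"
```

When arguments may contain spaces, "$@" is usually the form that passes them on unchanged.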



2.7 Summing Up

A shell script is a reusable series of commands put in an executable text file.
Any text editor can be used to write scripts.

Scripts start with #! followed by the path to the shell executing the commands
from the script. Comments are added to a script for your own future reference,
and also to make it understandable for other users. It is better to have too
many explanations than not enough.

Debugging a script can be done using shell options. Shell options can be
used for partial debugging or for analyzing the entire script. Inserting echo
commands at strategic locations is also a common troubleshooting technique.



Self-check Questions

1. What do you need to add to the first line of the script to indicate the Bash shell?
2. Why are comments needed and how do you add them?
3. What happens when a script is executed with "bash -x"?




2.8 Answers to the Self-Check questions

1. #!/bin/bash
2. Comments are useful to inform the reader about the script and make it
comprehensible. A comment is added in the format: # <the comment>
3. It will run the entire script in debug mode.


2.9 Terminal Questions

1. What are the different steps for creating a shell script?
2. How would you debug a part of the script?
3. What are the different shell debugging options?
4. Why is quoting used? Give examples.


COE Unit 2, Lesson 3



LESSON 3 CONDITIONAL STATEMENTS



3. Conditional statements




Conditional statements are one of the more advanced concepts and are used very
frequently in scripts. A clear understanding of this concept is very important.



3.0 Objectives

After going through this lesson, you will learn about:

The if statement
Using the exit status of a command
Comparing and testing input and files
If-then-else constructs
If-then-elif-else constructs
Using and testing the positional parameters
Nested if statements
Using case statements


3.1 Introduction

This chapter introduces the use of conditionals in Bash scripts. This would
enable the student to write scripts that are more powerful and cater to
different conditions.


3.2 Introduction to if

3.2.1 General

At times you need to specify different courses of action to be taken in a shell
script, depending on the success or failure of a command. The if construction
allows you to specify such conditions.

The most compact syntax of the if command is:

if TESTCOMMANDS; then CONSEQUENTCOMMANDS; fi

Example: checking shell options

# These lines will print a message if the noclobber option is set

if [ -o noclobber ]
then
    echo "Your files are protected against accidental overwriting using redirection."
fi

The TESTCOMMANDS list is executed, and if its return status is zero, the
CONSEQUENTCOMMANDS list is executed. The return status is the exit
status of the last command executed, or zero if no condition tested true.

The TESTCOMMANDS list often involves numerical or string comparison tests,
but it can also be any command that returns a status of zero when it succeeds
and some other status when it fails. Unary expressions are often used to
examine the status of a file. If the FILE argument to one of the primaries is of
the form /dev/fd/N, then file descriptor "N" is checked. stdin, stdout and stderr
and their respective file descriptors may also be used for tests.

Expressions used with if

The table below contains an overview of the so-called "primaries" that make
up the TESTCOMMANDS command or list of commands. These primaries are
put between square brackets to indicate the test of a conditional expression.

Table: Primary expressions

Primary                       Meaning
[ -a FILE ]                   True if FILE exists.
[ -o OPTIONNAME ]             True if shell option OPTIONNAME is enabled.
[ -z STRING ]                 True if the length of STRING is zero.
[ -n STRING ] or [ STRING ]   True if the length of STRING is non-zero.
[ STRING1 == STRING2 ]        True if the strings are equal. = may be used
                              instead of == for strict POSIX compliance.
[ STRING1 != STRING2 ]        True if the strings are not equal.
[ STRING1 < STRING2 ]         True if STRING1 sorts before STRING2
                              lexicographically in the current locale.
[ STRING1 > STRING2 ]         True if STRING1 sorts after STRING2
                              lexicographically in the current locale.
[ ARG1 OP ARG2 ]              OP is one of -eq, -ne, -lt, -le, -gt or -ge.
                              These arithmetic binary operators return true
                              if ARG1 is equal to, not equal to, less than,
                              less than or equal to, greater than, or greater
                              than or equal to ARG2. ARG1 and ARG2 are
                              integers.

Inside [ ], the < and > operators must be escaped (\< and \>) so that the
shell does not treat them as redirections.
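
A few of the primaries from the table can be tried out in a short sketch. The variable names and values below are assumptions chosen for illustration:

```shell
#!/bin/bash
# Sketch: string, integer and file primaries used with if.
# All names and values here are hypothetical.

empty=""
name="demo"
lines=201
scratch=$(mktemp)          # a regular file that certainly exists

results=""

if [ -z "$empty" ]; then results="$results z"; fi          # zero length
if [ -n "$name" ]; then results="$results n"; fi           # non-zero length
if [ "$name" = "demo" ]; then results="$results eq"; fi    # equal strings
if [ "$lines" -gt 150 ]; then results="$results gt"; fi    # integer comparison
if [ -f "$scratch" ]; then results="$results f"; fi        # regular file exists

echo "Tests that were true:$results"

rm -f "$scratch"
```

Quoting the variables inside the brackets keeps the test well formed even when a variable is empty.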


Expressions may be combined using the following operators, listed in
decreasing order of precedence:

Table: Combining expressions

Operation              Effect
[ ! EXPR ]             True if EXPR is false.
[ ( EXPR ) ]           Returns the value of EXPR. This may be used to
                       override the normal precedence of operators.
[ EXPR1 -a EXPR2 ]     True if both EXPR1 and EXPR2 are true.
[ EXPR1 -o EXPR2 ]     True if either EXPR1 or EXPR2 is true.

The [ (or test) built-in evaluates conditional expressions using a set of rules
based on the number of arguments. More information about this subject can
be found in the Bash documentation. Just as the if is closed with fi, the
opening square bracket must be closed with ] after the conditions have been
listed.

Commands following the then statement

The CONSEQUENTCOMMANDS list that follows the then statement can be
any valid UNIX command, any executable program, any executable shell
script or any shell statement, with the exception of the closing fi. It is important
to remember that the then and fi are considered to be separate statements in
the shell.
Therefore, when issued on the command line, they are separated by a
semicolon.
In a script, the different parts of the if statement are usually well separated.
Below are a couple of simple examples.
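
The combining operators can be sketched with three small functions. The function names and the range 1 to 10 are assumptions made for this example; each function returns the exit status of its test, which if then evaluates:

```shell
#!/bin/bash
# Sketch: combining expressions with -a, -o and !.
# in_range, out_of_range and not_empty are hypothetical helpers.

in_range()     { [ "$1" -ge 1 -a "$1" -le 10 ]; }    # both must be true
out_of_range() { [ "$1" -lt 1 -o "$1" -gt 10 ]; }    # either may be true
not_empty()    { [ ! -z "$1" ]; }                    # negation

report=""
if in_range 5;      then report="$report 5-in";   fi
if out_of_range 42; then report="$report 42-out"; fi
if in_range 42;     then report="$report 42-in";  fi
if not_empty "abc"; then report="$report abc-set"; fi

echo "True conditions:$report"
```

The same conditions can also be written as two separate tests joined by the shell, for example [ "$1" -ge 1 ] && [ "$1" -le 10 ], which many scripts prefer for readability.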

Checking files

The first example checks for the existence of a file:

filecheck.sh
#!/bin/bash

echo "This script checks the existence of the demo file."
echo "Checking..."
if [ -f /usr/guest/demo.txt ]
then
    echo "/usr/guest/demo.txt file exists."
fi
echo
echo "...done."

bash> ./filecheck.sh
This script checks the existence of the demo file.
Checking...
/usr/guest/demo.txt file exists.

...done.

3.2.2 Simple applications of if

Testing exit status

bash> if [ $? -eq 0 ]
> then echo 'That was a good job!'
> fi
That was a good job!

bash>

Numeric comparisons

bash> num=`wc -l < demo1.txt`

bash> echo $num
201

bash> if [ "$num" -gt "150" ]
> then echo ; echo "This is a big file."
> echo ; fi

This is a big file.

bash>

String comparisons

dir=`pwd`                  # for example /tmp/proc
updir=`dirname "$dir"`     # /tmp
if [ "$updir" != "/tmp" ]; then
    echo "You need to be in a subdirectory of /tmp."
    exit 1
fi

3.3 More advanced if usage


3.3.1 if-then-else constructs

Like the CONSEQUENTCOMMANDS list following the then statement, the
ALTERNATECONSEQUENTCOMMANDS list following the else statement
can hold any UNIX-style command that returns an exit status.

Example 1

fun_weigh.sh
#!/bin/bash

# This script prints a message about your weight if you give it your
# weight in kilos and height in centimeters.

weight="$1"
height="$2"
idealweight=$(( height - 110 ))

if [ "$weight" -le "$idealweight" ]; then
    echo "You should eat a bit more fat."
else
    echo "You should eat a bit more fruit."
fi

On executing the script we get:

bash> bash -x fun_weigh.sh 55 169
+ weight=55
+ height=169
+ idealweight=59
+ '[' 55 -le 59 ']'
+ echo 'You should eat a bit more fat.'
You should eat a bit more fat.

Example 2


Testing the number of arguments - The previous script is modified so that it
prints a message if more or fewer than 2 arguments are given:

fun_weigh.sh
#!/bin/bash

# This script prints a message about your weight if you give it your
# weight in kilos and height in centimeters.

if [ $# -ne 2 ]; then
    echo "Usage: $0 weight_in_kilos length_in_centimeters"
    exit
fi

weight="$1"
height="$2"
idealweight=$(( height - 110 ))

if [ "$weight" -le "$idealweight" ]; then
    echo "You should eat a bit more fat."
else
    echo "You should eat a bit more fruit."
fi

bash> ./fun_weigh.sh 70 150
You should eat a bit more fruit.

bash> ./fun_weigh.sh 70 150 33
Usage: ./fun_weigh.sh weight_in_kilos length_in_centimeters

The first argument is referred to as $1, the second as $2 and so on. The total
number of arguments is stored in $#.

3.3.2 if-then-elif-else constructs

This is the full form of the if statement:

if TESTCOMMANDS; then
    CONSEQUENTCOMMANDS;
elif MORETESTCOMMANDS; then
    MORECONSEQUENTCOMMANDS;
else
    ALTERNATECONSEQUENTCOMMANDS;
fi
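
As a minimal sketch of the if-then-elif-else form, the hypothetical function below classifies an integer. The function name and the three labels are assumptions made for this example:

```shell
#!/bin/bash
# Sketch: three-way decision with if, elif and else.
# describe_number is a hypothetical helper; $1 must be an integer.

describe_number() {
    if [ "$1" -lt 0 ]; then
        echo "negative"
    elif [ "$1" -eq 0 ]; then
        echo "zero"
    else
        echo "positive"
    fi
}

for n in -3 0 7; do
    echo "$n is $(describe_number "$n")"
done
```

The tests are tried from top to bottom, and only the commands belonging to the first test that succeeds are executed.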

The following script, testleap.sh, uses the if-then-elif-else form. Also note
the nested ifs here. You may use as many levels of nested ifs as you can
logically manage.

testleap.sh
#!/bin/bash
# This script will test if we're in a leap year or not.

year=`date +%Y`

if [ $(( year % 400 )) -eq 0 ]; then
    echo "This is a leap year. February has 29 days."
elif [ $(( year % 4 )) -eq 0 ]; then
    if [ $(( year % 100 )) -ne 0 ]; then
        echo "This is a leap year. February has 29 days."
    else
        echo "This is not a leap year. February has 28 days."
    fi
else
    echo "This is not a leap year. February has 28 days."
fi

bash> date
Fri Dec 21 17:14:28 IST 2007

bash> testleap.sh
This is not a leap year. February has 28 days.

3.3.3 Returning the exit status using if

Sometimes you test for a condition and find that it fails. You would then
rather have the program terminate, since there is no point in continuing if
an essential resource is missing, say the file you want to search. The exit
statement is used to terminate a program prematurely.

The exit statement takes an optional argument. This argument is the integer
exit status code, which is passed back to the parent and stored in the $?
variable.

#!/bin/bash
if [ $# -ne 2 ]; then
    echo "Usage: $0 <file1> <file2>"
    exit 2
fi
...<rest of script>

In this example, if the number of arguments is not 2, a message about the
usage is printed and the execution is exited with code 2.

3.4 Using case statements

Nested if statements might be nice, but as soon as you are confronted with a
couple of different possible actions to take, they tend to confuse. For the more
complex conditionals, use the case syntax:

case EXPRESSION in
    CASE1) COMMANDLIST;;
    CASE2) COMMANDLIST;;
    ...
    CASEN) COMMANDLIST;;
esac

Each case is an expression matching a pattern. The commands in the
COMMANDLIST for the first match are executed. The "|" symbol may be
used for separating multiple patterns, and the ")" operator terminates a pattern
list. Each case plus its associated commands is called a clause. Each clause
must be terminated with ";;". Each case statement is ended with the esac
statement.
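
The pattern syntax described above, including "|" for alternatives and a catch-all "*" clause, can be sketched as follows. The function name and the file extensions are assumptions chosen for illustration:

```shell
#!/bin/bash
# Sketch: case with multiple patterns per clause and a default clause.
# file_kind is a hypothetical helper that classifies a file name.

file_kind() {
    case "$1" in
        *.sh|*.bash) echo "shell script" ;;
        *.txt)       echo "text file" ;;
        *)           echo "unknown" ;;
    esac
}

for f in myScript.sh demo1.txt update.ppt; do
    echo "$f: $(file_kind "$f")"
done
```

Patterns are tried in order, so the catch-all clause is placed last.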

In the example, we demonstrate the use of case for getting the disk usage.

disk_utility.sh
#!/bin/bash

echo -e "\n 1. The free disk space\n 2. Space consumed by this user\n 3. Exit\n\n SELECTION: \c"

read selection
case $selection in
    1) df ;;
    2) du -s "$HOME" ;;
    3) exit ;;
    *) echo "Not a valid option" ;;
esac

With the -e option, echo interprets backslash escapes. Each \n produces a
newline, and the \c escape sequence suppresses the trailing newline, which
positions the cursor immediately after the prompt instead of on the next line.

The read statement takes input from the user, thereby making the script
interactive. The input is read into a variable (selection in this case). The output
is as follows:

bash> disk_utility.sh

 1. The free disk space
 2. Space consumed by this user
 3. Exit

 SELECTION: 2
456100 /home/pallavi

3.5 Summary

In this chapter we learned how to build conditions into our scripts so that
different actions can be undertaken upon success or failure of a command.
The actions can be determined using the if statement. This allows you to
perform arithmetic and string comparisons, and testing of exit code, input and
files needed by the script.

A simple if-then-fi test often precedes commands in a shell script in order to
prevent output generation, so that the script can easily be run in the
background or through the cron facility. More complex definitions of conditions
are usually put in a case statement.



Self-check Questions

1. What is the use of the "if" statement?
2. What is the exit status of a command? What is its normal value and where is the
value stored?



3.6 Answers to the Self-Check questions

1. The "if" statement takes two-way decisions depending on the fulfillment of a
certain condition.
2. The exit status is an integer that represents the success or failure of a
command. It has the value 0 when the command executes successfully and is
stored in the parameter $?


3.7 Terminal Questions

1. List some applications of the if-then-elif-else statement.
2. Give an example of case usage.

COE Unit 2, Lesson 4



LESSON 4 REPETITIVE TASKS

4. REPETITIVE TASKS
4.0 OBJECTIVES
4.1 INTRODUCTION
4.2 THE FOR LOOP
4.2.1 How does it work?
4.2.2 Examples
4.3 THE WHILE LOOP
4.3.1 What is it?
4.3.2 Examples
4.4 THE UNTIL LOOP
4.4.1 What is it?
4.4.2 Example
4.5 I/O REDIRECTION AND LOOPS
4.5.1 Input redirection
4.5.2 Output redirection
4.6 BREAK AND CONTINUE
4.6.1 The break built-in
4.6.2 The continue built-in
4.6.3 Examples
4.7 MAKING MENUS WITH THE SELECT BUILT-IN
4.7.1 General
4.7.2 Submenus
4.8 THE SHIFT BUILT-IN
4.8.1 What does it do?
4.8.2 Examples
4.9 SUMMARY
4.10 ANSWERS TO THE SELF-CHECK QUESTIONS
4.11 TERMINAL QUESTIONS



4. Repetitive tasks




It is important to appreciate the need for loops in scripts. Loops take scripting to
the next level and come in very handy in a wide variety of applications.



4.0 Objectives

Upon completion of this chapter, you will be able to
Use for, while and until loops, and decide which loop fits which occasion.
Use the break and continue Bash built-ins.
Write scripts using the select statement.
Write scripts that take a variable number of arguments.


4.1 Introduction

This chapter teaches the student to write different types of loops for any
application that requires repetitive tasks. This is very helpful in writing useful
scripts that require something to be done repeatedly.


4.2 The for loop

4.2.1 How does it work?

The for loop is the first of the three shell looping constructs. This loop allows
for specification of a list of values. A list of commands is executed for each
value in the list.

The syntax for this loop is:

for NAME [in LIST ]; do COMMANDS; done

If [in LIST] is not present, it is replaced with $@ and for executes the
COMMANDS once for each positional parameter that is set. The return status
is the exit status of the last command that executes. If no commands are
executed because LIST does not expand to any items, the return status is
zero.

NAME can be any variable name, although i is used very often. LIST can be
any list of words, strings or numbers, which can be literal or generated by any
command. The COMMANDS to execute can also be any operating system
commands, script, program or shell statement. The first time through the loop,
NAME is set to the first item in the LIST. The second time, its value is set to
the second item in the list, and so on. The loop terminates when NAME has
taken on each of the values from LIST and no items are left in the LIST.
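
The two forms of the syntax described above can be sketched as follows. This is a minimal illustration only; the list of fruit names and the function name count_args are made up for the example.

```shell
#!/bin/bash
# Sketch: a for loop over an explicit LIST and over the positional parameters.

# Loop over a literal list of words.
for fruit in apple banana cherry; do
    echo "fruit: $fruit"
done

# With no "in LIST", for iterates over the positional parameters ("$@").
# count_args is a hypothetical helper that counts its own arguments this way.
count_args() {
    local n=0
    for arg; do
        n=$((n + 1))
    done
    echo "$n"
}

count_args one "two words" three
```

Because the second loop walks over "$@", an argument that contains a space ("two words") is still treated as a single item.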

4.2.2 Examples

Using command substitution for specifying LIST items

The first is a command line example, demonstrating the use of a for loop that
makes a backup copy of each .xml file. After issuing the command, it is safe
to start working on your sources:

    bash> ls *.xml
    file1.xml  file2.xml  file3.xml

    bash> ls *.xml > list

    bash> for i in `cat list`; do cp "$i" "$i".bak ; done

    bash> ls *.xml*
    file1.xml  file1.xml.bak  file2.xml  file2.xml.bak  file3.xml
    file3.xml.bak

This one lists the files in /sbin that are just plain text files, and possibly scripts:

    for i in `ls /sbin`; do file /sbin/$i | grep ASCII; done

Using the content of a variable to specify LIST items

The following is a specific application script for converting HTML files,
compliant with a certain scheme, to PHP files. The conversion is done by
taking out the first 25 and the last 21 lines, replacing these with two PHP tags
that provide header and footer lines:

    #!/bin/bash
    # specific conversion script for my html files to php
    LIST="$(ls *.html)"
    # $LIST is left unquoted so that it is split into one word per file name
    # (this assumes file names without whitespace).
    for i in $LIST; do
        NEWNAME=$(ls "$i" | sed -e 's/html/php/')
        cat beginfile > "$NEWNAME"
        cat "$i" | sed -e '1,25d' | tac | sed -e '1,21d' | tac >> "$NEWNAME"
        cat endfile >> "$NEWNAME"
    done

html2php.sh


Since we don't do a line count here, there is no way of knowing the line
number from which to start deleting lines until reaching the end. The problem
is solved using tac, which reverses the lines in a file.
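
The tac trick can be tried on its own with a short sketch. It assumes GNU sed and tac are available; the function name strip_head_tail and the sample input from seq are illustrative only.

```shell
#!/bin/bash
# Sketch: drop the first and last lines of a stream without counting its lines.

# strip_head_tail TOP BOTTOM: remove TOP lines from the start and BOTTOM lines
# from the end of standard input (both are assumed to be 1 or greater).
strip_head_tail() {
    local top=$1 bottom=$2
    # Delete from the top, reverse, delete from the (new) top, reverse back.
    sed -e "1,${top}d" | tac | sed -e "1,${bottom}d" | tac
}

seq 1 10 | strip_head_tail 2 3
```

With the numbers 1 to 10 as input, the first 2 and the last 3 lines are removed, leaving 3 to 7.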



4.3 The while loop

4.3.1 What is it?

The while construct allows for repetitive execution of a list of commands, as
long as the command controlling the while loop executes successfully (exit
status of zero). The syntax is:

while CONTROLCOMMAND; do CONSEQUENTCOMMANDS; done

CONTROLCOMMAND can be any command(s) that can exit with a success
or failure status. The CONSEQUENTCOMMANDS can be any program,
script or shell construct.

As soon as the CONTROLCOMMAND fails, the loop exits. In a script, the
command following the done statement is executed.

The return status is the exit status of the last CONSEQUENTCOMMANDS
command, or zero if none was executed.
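
As a minimal sketch of this behaviour, the hypothetical function countdown below keeps looping while its test command succeeds:

```shell
#!/bin/bash
# Sketch: while runs its body as long as the control command succeeds.

countdown() {
    local n=$1
    while [ "$n" -gt 0 ]; do
        echo "$n"
        n=$((n - 1))
    done
    echo "liftoff"
}

countdown 3

# A loop whose body never runs returns status zero.
while false; do :; done
echo "status of empty loop: $?"
```

The last two lines illustrate the return status rule: the control command fails immediately, no consequent command runs, and the loop as a whole still reports success.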

4.3.2 Examples

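Simple example using while

Here is an example for the impatient:

    #!/bin/bash

    # This script opens 4 terminal windows.

    i="0"

    while [ $i -lt 4 ]
    do
        xterm &
        i=$(($i+1))
    done

Nested while loops

The example below was written to copy pictures that are made with a webcam
to a web directory. Every five minutes a picture is taken. Every hour, a new
directory is created, holding the images for that hour. Every day, a new
directory is created containing 24 subdirectories. The script runs in the
background.

    #!/bin/bash

    # This script copies files from my homedirectory into the webserver directory.
    # (use scp and SSH keys for a remote directory)
    # A new directory is created every hour.

    PICSDIR=/home/mohan/pics
    WEBDIR=/var/www/mohan/webcam

    while true; do
        DATE=`date +%Y%m%d`
        HOUR=`date +%H`
        mkdir $WEBDIR/"$DATE"

        while [ $HOUR -ne "00" ]; do
            DESTDIR=$WEBDIR/"$DATE"/"$HOUR"
            mkdir "$DESTDIR"
            mv $PICSDIR/*.jpg "$DESTDIR"/
            sleep 3600
            HOUR=`date +%H`
        done
    done

Note the use of the true statement. This means: continue execution until we
are forcibly interrupted (with kill or Ctrl+C).

This small script can be used for simulation testing; it generates files:

    #!/bin/bash

    # This generates a file every 5 minutes

    while true; do
        touch pic-`date +%s`.jpg
        sleep 300
    done

Note the use of the date command to generate all kinds of file and directory
names. See the man page for more information on the date command.
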
Calculating an average

This script calculates the average of user input, which is tested before it is
processed: if input is not within range, a message is printed. If q is pressed,
the loop exits:

    #!/bin/bash

    # Calculate the average of a series of numbers.

    SCORE="0"
    AVERAGE="0"
    SUM="0"
    NUM="0"

    while true; do

        echo -n "Enter your score [0-100%] ('q' for quit): "; read SCORE;

        if (("$SCORE" < "0")) || (("$SCORE" > "100")); then
            echo "Be serious. Come on, try again: "
        elif [ "$SCORE" == "q" ]; then
            echo "Average rating: $AVERAGE%."
            break
        else
            SUM=$(($SUM + $SCORE))
            NUM=$(($NUM + 1))
            AVERAGE=$(($SUM / $NUM))
        fi

    done

    echo "Exiting."

Note how the variables in the last lines are left unquoted in order to do
arithmetic.


4.4 The until loop

4.4.1 What is it?

The until loop is very similar to the while loop, except that the loop executes
until the TESTCOMMAND executes successfully. As long as this command
fails, the loop continues. The syntax is the same as for the while loop:

until TESTCOMMAND; do CONSEQUENTCOMMANDS; done

The return status is the exit status of the last command executed in the
CONSEQUENTCOMMANDS list, or zero if none was executed.
TESTCOMMAND can, again, be any command that can exit with a success
or failure status, and CONSEQUENTCOMMANDS can be any UNIX
command, script or shell construct.



As was previously explained, the ";" may be replaced with one or more
newlines wherever it appears.
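
A minimal sketch of the until syntax, written with newlines in place of the ";" separators. The function name count_up is made up for the example.

```shell
#!/bin/bash
# Sketch: until runs its body as long as the test command FAILS.

count_up() {
    local limit=$1
    local i=1
    until [ "$i" -gt "$limit" ]
    do
        echo "$i"
        i=$((i + 1))
    done
}

count_up 3
```

The same loop could be written with while [ "$i" -le "$limit" ]; which form reads better depends on whether the stop condition or the continue condition is easier to state.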


4.4.2 Example

An improved version of the picture-sorting script (the nested while example in
Section 4.3.2), which tests for available disk space. If disk space is not
enough, pictures from the previous months are removed:

    #!/bin/bash

    # This script copies files from my homedirectory into the webserver directory.
    # A new directory is created every hour.
    # If the pics are taking up too much space, the oldest are removed.

    # Same directories as in the script from Section 4.3.2.
    PICSDIR=/home/mohan/pics
    WEBDIR=/var/www/mohan/webcam

    while true; do
        DISKFUL=$(df -h $WEBDIR | grep -v File | awk '{print $5}' | cut -d "%" -f1)

        until [ $DISKFUL -ge "90" ]; do
            DATE=`date +%Y%m%d`
            HOUR=`date +%H`
            mkdir $WEBDIR/"$DATE"

            while [ $HOUR -ne "00" ]; do
                DESTDIR=$WEBDIR/"$DATE"/"$HOUR"
                mkdir "$DESTDIR"
                mv $PICSDIR/*.jpg "$DESTDIR"/
                sleep 3600
                HOUR=`date +%H`
            done

            DISKFUL=$(df -h $WEBDIR | grep -v File | awk '{print $5}' | cut -d "%" -f1)
        done

        TOREMOVE=$(find $WEBDIR -type d -a -mtime +30)
        for i in $TOREMOVE; do
            rm -rf "$i";
        done

    done

Note the initialization of the HOUR and DISKFUL variables and the use of
options with find and date in order to obtain a correct listing for TOREMOVE.


4.5 I/O redirection and loops



4.5.1 Input redirection

Instead of controlling a loop by testing the result of a command or by user
input, you can specify a file from which to read input that controls the loop. In
such cases, read is often the controlling command. As long as input lines are
fed into the loop, execution of the loop commands continues. As soon as all
the input lines are read, the loop exits.

Since the loop construct is considered to be one command structure (such as
while TESTCOMMAND; do CONSEQUENTCOMMANDS; done), the
redirection should occur after the done statement, so that it complies with the
form

command < file

This kind of redirection also works with other kinds of loops.
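
A minimal sketch of a loop controlled by read, with the redirection placed after done. The function name count_nonempty_lines and the temporary sample file are assumptions made for the example.

```shell
#!/bin/bash
# Sketch: feed a file to a while loop by redirecting after "done".

count_nonempty_lines() {
    local file=$1
    local count=0
    local line
    # IFS= and -r keep leading whitespace and backslashes intact.
    while IFS= read -r line; do
        if [ -n "$line" ]; then
            count=$((count + 1))
        fi
    done < "$file"
    echo "$count"
}

# Demo on a temporary file with made-up content (one empty line).
tmpfile=$(mktemp)
printf 'alpha\n\nbeta\ngamma\n' > "$tmpfile"
count_nonempty_lines "$tmpfile"
rm -f "$tmpfile"
```

Because the file is attached to the loop as a whole, every read inside the loop continues from where the previous one stopped, and the loop ends when read reaches the end of the file.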

4.5.2 Output redirection

In the example below, output of the find command is used as input for the
read command controlling a while loop:

    #!/bin/bash

    # This script creates a subdirectory in the current directory, to which old
    # files are moved.
    # Might be something for cron (if slightly adapted) to execute weekly or
    # monthly.

    ARCHIVENR=`date +%Y%m%d`
    DESTDIR="$PWD/archive-$ARCHIVENR"

    mkdir "$DESTDIR"

    find "$PWD" -type f -a -mtime +5 | while read -r file
    do
        gzip "$file"; mv "$file".gz "$DESTDIR"
        echo "$file archived"
    done

archiveoldstuff.sh

Files are compressed by the gzip command before they are moved into the
archive directory.


4.6 Break and continue



4.6.1 The break builtin

The break statement is used to exit the current loop before its normal ending.
This is done when you don't know in advance how many times the loop will
have to execute, for instance because it is dependent on user input.

The example below demonstrates a while loop that can be interrupted. This is
a slightly improved version of the wisdom.sh script from Section 4.3.2.

    #!/bin/bash

    # This script provides wisdom
    # You can now exit in a decent way.

    FORTUNE=/usr/games/fortune

    while true; do
        echo "On which topic do you want advice?"
        echo "1. politics"
        echo "2. startrek"
        echo "3. kernelnewbies"
        echo "4. sports"
        echo "5. bofh-excuses"
        echo "6. magic"
        echo "7. love"
        echo "8. literature"
        echo "9. drugs"
        echo "10. education"
        echo

        echo -n "Enter your choice, or 0 for exit: "
        read choice
        echo

        case $choice in
            1)
                $FORTUNE politics
                ;;
            2)
                $FORTUNE startrek
                ;;
            3)
                $FORTUNE kernelnewbies
                ;;
            4)
                echo "Sports are a waste of time, energy and money."
                echo "Go back to your keyboard."
                ;;
            5)
                $FORTUNE bofh-excuses
                ;;
            6)
                $FORTUNE magic
                ;;
            7)
                $FORTUNE love
                ;;
            8)
                $FORTUNE literature
                ;;
            9)
                $FORTUNE drugs
                ;;
            10)
                $FORTUNE education
                ;;
            0)
                echo "OK, see you!"
                break
                ;;
            *)
                echo "That is not a valid choice, try a number from 0 to 10."
                ;;
        esac
    done

Mind that break exits the loop, not the script. This can be demonstrated by
adding an echo command at the end of the script. This echo will also be
executed upon input that causes break to be executed (when the user types
"0"). In nested loops, break allows for specification of which loop to exit. See
the Bash info pages for more.

4.6.2 The continue builtin

The continue statement resumes iteration of an enclosing for, while, until or
select loop.
When used in a for loop, the controlling variable takes on the value of the next
element in the list. When used in a while or until construct, on the other hand,
execution resumes with TESTCOMMAND at the top of the loop.

4.6.3 Examples

The tolower.sh script in this section converts file names to lower case. If no
conversion needs to be done, a continue statement restarts execution of the
loop. These commands don't eat much system resources, and most likely,
similar problems can be solved using sed and awk. However, it is useful to
know about this kind of construction when executing heavy jobs, which might
not even be necessary when tests are inserted at the correct locations in a
script, sparing system resources.
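
As a minimal sketch of the two builtins, the hypothetical functions below use continue to skip list items and break 2 to leave two nested loops at once. The function names and the numbers are made up for the illustration.

```shell
#!/bin/bash
# Sketch: continue skips to the next iteration, break N leaves N nested loops.

# Print only the odd numbers among the arguments.
print_odd() {
    local n
    for n in "$@"; do
        if [ $((n % 2)) -eq 0 ]; then
            continue        # go on with the next element of the list
        fi
        echo "$n"
    done
}

# Print the first pair "a b" (each from 1 to 5) whose product equals TARGET.
find_pair() {
    local target=$1 a b
    for a in 1 2 3 4 5; do
        for b in 1 2 3 4 5; do
            if [ $((a * b)) -eq "$target" ]; then
                echo "$a $b"
                break 2     # exit the inner and the outer loop
            fi
        done
    done
}

print_odd 1 2 3 4 5
find_pair 6
```

A plain break in find_pair would only leave the inner loop, so the outer loop would go on and print further pairs.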




    #!/bin/bash

    # This script converts all file names containing upper case characters into
    # file names containing only lower case characters.

    LIST="$(ls)"

    # $LIST is left unquoted so that the loop sees one file name at a time.
    for name in $LIST; do

        if [[ "$name" != *[[:upper:]]* ]]; then
            continue
        fi

        ORIG="$name"
        NEW=`echo $name | tr 'A-Z' 'a-z'`

        mv "$ORIG" "$NEW"
        echo "new name for $ORIG is $NEW"
    done

tolower.sh

This script has at least one disadvantage: it overwrites existing files. The
noclobber option to Bash is only useful when redirection occurs. The -b
option to the mv command provides more security, but is only safe in case of
one accidental overwrite, as is demonstrated in this test (run with mv -b in
place of mv):

    bash> rm *

    bash> touch test Test TEST

    bash> bash -x tolower.sh
    ++ ls
    + LIST=test
    Test
    TEST
    + [[ test != *[[:upper:]]* ]]
    + continue
    + [[ Test != *[[:upper:]]* ]]
    + ORIG=Test
    ++ echo Test
    ++ tr A-Z a-z
    + NEW=test
    + mv -b Test test
    + echo 'new name for Test is test'
    new name for Test is test
    + [[ TEST != *[[:upper:]]* ]]
    + ORIG=TEST
    ++ echo TEST
    ++ tr A-Z a-z
    + NEW=test
    + mv -b TEST test
    + echo 'new name for TEST is test'
    new name for TEST is test

    bash> ls -a
    ./  ../  test  test~

The tr command is part of the textutils package; it can perform all kinds of
character transformations.


4.7 Making menus with the select builtin

4.7.1 General

Use of select

The select construct allows easy menu generation. The syntax is quite similar
to that of the for loop:

select WORD [in LIST]; do RESPECTIVECOMMANDS; done

LIST is expanded, generating a list of items. The expansion is printed to
standard error; each item is preceded by a number. If in LIST is not present,
the positional parameters are printed, as if in $@ had been specified.
LIST is only printed once.

Upon printing all the items, the PS3 prompt is printed and one line from
standard input is read. If this line consists of a number corresponding to one
of the items, the value of WORD is set to the name of that item. If the line is
empty, the items and the PS3 prompt are displayed again. If an EOF (End Of
File) character is read, the loop exits. Since most users don't have a clue
which key combination is used for the EOF sequence, it is more user-friendly
to have a break command as one of the items. Any other value of the read
line will set WORD to be a null string.

The read line is saved in the REPLY variable.
The RESPECTIVECOMMANDS are executed after each selection until the
number representing the break is read. This exits the loop.

Examples

This is a very simple example, but as you can see, it is not very user-friendly:

    #!/bin/bash

    echo "This script can make any of the files in this directory private."
    echo "Enter the number of the file you want to protect:"

    select FILENAME in *;
    do
        echo "You picked $FILENAME ($REPLY), it is now only accessible to you."
        chmod go-rwx "$FILENAME"
    done

    bash> ./private.sh
    This script can make any of the files in this directory private.
    Enter the number of the file you want to protect:
    1) archive-20030129
    2) bash
    3) private.sh
    #? 1
    You picked archive-20030129 (1)
    #?

Setting the PS3 prompt and adding a possibility to quit makes it better:

    #!/bin/bash

    echo "This script can make any of the files in this directory private."
    echo "Enter the number of the file you want to protect:"

    PS3="Your choice: "
    QUIT="QUIT THIS PROGRAM - I feel safe now."
    touch "$QUIT"

    select FILENAME in *;
    do
        case $FILENAME in
            "$QUIT")
                echo "Exiting."
                break
                ;;
            *)
                echo "You picked $FILENAME ($REPLY)"
                chmod go-rwx "$FILENAME"
                ;;
        esac
    done
    rm "$QUIT"

private.sh


4.7.2 Submenus

Any statement within a select construct can be another select loop, enabling
(a) submenu(s) within a menu.

By default, the PS3 variable is not changed when entering a nested select
loop. If you want a different prompt in the submenu, be sure to set it at the
appropriate time(s).
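
A submenu with its own prompt might be sketched as follows. The menu items and the function name menu are invented for the illustration, and the demo feeds the choices through a pipe so that it runs without a keyboard.

```shell
#!/bin/bash
# Sketch: a select menu with a submenu; PS3 is set explicitly for each level.

menu() {
    local main sub
    PS3="Main menu: "
    select main in fruits quit; do
        case $main in
            fruits)
                PS3="Fruit: "              # prompt for the submenu
                select sub in apple pear back; do
                    case $sub in
                        back) break ;;     # leaves the submenu only
                        "")   echo "invalid choice: $REPLY" ;;
                        *)    echo "picked $sub" ;;
                    esac
                done
                PS3="Main menu: "          # restore the outer prompt
                ;;
            quit)
                break
                ;;
            *)
                echo "invalid choice: $REPLY"
                ;;
        esac
    done
}

# Scripted input: open the submenu, pick "pear", go back, then quit.
# The menus and prompts go to standard error, the answers to standard output.
printf '1\n2\n3\n2\n' | menu
```

In an interactive script, menu would simply be called without the pipe. Restoring PS3 after the inner loop matters because the variable is shared by both levels.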


4.8 The shift builtin

4.8.1 What does it do?

The shift command is one of the Bourne shell builtins that comes with Bash.
This command takes one argument, a number. The positional parameters are
shifted to the left by this number, N. The positional parameters from N+1 to $#
are renamed to variable names from $1 to $#-N. Say you have a
command that takes 10 arguments, and N is 4, then $5 becomes $1, $6
becomes $2 and so on. $10 becomes $6 and the original $1, $2, $3 and $4 are
thrown away.
If N is zero or greater than $# (the total number of arguments, see Section
7.2.1.2), the positional parameters are not changed. If N is not present, it is
assumed to be 1. The return status is zero unless N is greater than $# or less
than zero, in which case it is non-zero.
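
The renumbering can be checked with a short sketch. The function names demo_shift and bracket_args are made up; the ten letters stand in for ten arguments.

```shell
#!/bin/bash
# Sketch: shift N discards the first N positional parameters.

demo_shift() {
    echo "before: $# arguments, \$1 is $1"
    shift 4
    echo "after: $# arguments, \$1 is $1"
}

# Process every argument in turn: use $1, then shift it away.
bracket_args() {
    local out=""
    while (( $# )); do
        out="$out[$1]"
        shift
    done
    echo "$out"
}

demo_shift a b c d e f g h i j
bracket_args x "y z" w
```

After shift 4 the fifth argument (e) has become $1 and six arguments remain. The second function is the usual pattern for an unknown number of arguments: the loop ends when $# reaches zero.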

4.8.2 Examples

A shift statement is typically used when the number of arguments to a
command is not known in advance, for instance when users can give as many
arguments as they like. In such cases, the arguments are usually processed
in a while loop with a test condition of (( $# )). This condition is true as long as
the number of arguments is greater than zero. The $1 variable and the shift
statement process each argument. The number of arguments is reduced each
time shift is executed and eventually becomes zero, upon which the while
loop exits.

The example below, cleanup.sh, uses shift statements to process each directory
given on the command line, running find on each one:

    #!/bin/bash

    # This script can clean up files that were last accessed over 365 days ago.

    USAGE="Usage: $0 dir1 dir2 dir3 ... dirN"

    if [ "$#" == "0" ]; then
        echo "$USAGE"
        exit 1
    fi

    while (( "$#" )); do

        if [[ "$(ls "$1")" == "" ]]; then
            echo "Empty directory, nothing to be done."
        else
            find "$1" -type f -a -atime +365 -exec rm -i {} \;
        fi

        shift

    done

The above find command can be replaced with the following:

find options | xargs [commands_to_execute_on_found_files]

The xargs command builds and executes command lines from standard input.
This has the advantage that the command line is filled until the system limit is
reached. Only then will the command to execute be called; in the above
example this would be rm. If there are more arguments, a new command line
will be used, until that one is full or until there are no more arguments. The
same thing using find -exec calls on the command to execute on the found
files every time a file is found. Thus, using xargs greatly speeds up your
scripts and the performance of your machine.


4.9 Summary

In this chapter, we discussed how repetitive commands can be incorporated
in loop constructs. Most common loops are built using the for, while or until
statements, or a combination of these commands. The for loop executes a
task a defined number of times. If you don't know how many times a
command should execute, use either until or while to specify when the loop
should end.

Loops can be interrupted or reiterated using the break and continue
statements. A file can be used as input for a loop using the input redirection
operator; loops can also read output from commands that is fed into the loop
using a pipe.

The select construct is used for printing menus in interactive scripts. Looping
through the command line arguments to a script can be done using the shift
statement.



Self-check Questions

1. What is the use of loops?
2. List the different types of loops in the shell.
3. What is the use of the "break" statement?
4. What will the following construct do and why?
while [ 5 ]




4.10 Answers to the Self-Check questions

1. Loops let the user perform a set of instructions repeatedly.
2. for, while and until.
3. The break statement is used to exit the current loop before its normal ending.
4. This sets up an infinite loop: [ 5 ] tests whether the string "5" is non-empty,
which is always true. (For the same reason, while [ 0 ] also loops forever.)
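
The answer to question 4 can be verified with a small sketch. The function name is_true is made up for the illustration.

```shell
#!/bin/bash
# Sketch: [ STRING ] succeeds whenever STRING is non-empty.

is_true() {
    if [ "$1" ]; then
        echo "true"
    else
        echo "false"
    fi
}

is_true 5     # non-empty string
is_true 0     # also a non-empty string
is_true ""    # the empty string
```

Only the empty string makes the test fail; the numeric value of the string plays no role.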


4.11 Terminal Questions

1. How would you decide which type of loop to use?
2. Explain why it is so important to put the variables in between double quotes in
the example from Section 4.4.2.
3. Describe the shift builtin command.
4. There are at least 6 syntactical mistakes in the following program. Locate
them.

    ppprunning = yes
    while $ppprunning = yes ; do
    echo INTERNET MENU\n
    1. Dial out
    2. Exit

    Choice:
    read choice
    case choice in
    1) if [ -z $ppprunning ]
    echo Enter your username and password
    else
    chat.sh
    endif ;
    *) ppprunning=no
    endcase
    done

COE
Unit 2, Lesson 5



LESSON 5 REGULAR EXPRESSIONS

5. REGULAR EXPRESSIONS......................................................................................... 133
5.0 OBJECTIVES .......................................................................................................... 133
5.1 INTRODUCTION ...................................................................................................... 133
5.2 REGULAR EXPRESSIONS ....................................................................................... 133
5.2.1 What are regular expressions? .................................................................... 133
5.2.2 The Structure of a Regular Expression....................................................... 134
5.2.3 Regular expression metacharacters ........................................................... 135
5.2.4 Creating complex regular expressions by concatenating other regEx .. 136
5.2.5 Using metacharacters on regEx to create complex regEx ..................... 136
5.3 THE GREP .............................................................................................................. 137
5.3.1 Grep and regular expressions ...................................................................... 138
5.4 PATTERN MATCHING USING SHELL........................................................................ 140
5.4.1 Character ranges............................................................................................ 140
5.4.2 Character classes........................................................................................... 141
5.5 SUMMARY .............................................................................................................. 141
5.6 ANSWERS TO THE SELF-CHECK QUESTIONS........................................................ 142
5.7 TERMINAL QUESTIONS .......................................................................................... 142


5. Regular Expressions




Regular expressions are very helpful in creating powerful scripts. They are also
used heavily by the advanced Unix utilities that we will study later, such as
sed and awk, and by the Perl language.



5.0 Objectives

After going through this lesson, you will know about:

Using regular expressions
Regular expression metacharacters
Finding patterns in files or output
Character ranges and classes in Bash


5.1 Introduction

This chapter introduces the concept of regular expressions. A regular
expression is a pattern that describes a set of strings. This is a very powerful
concept and can be used effectively in scripting.


5.2 Regular expressions

5.2.1 What are regular expressions?

Often you will encounter conditions where you need to match specific patterns
in scripts. For example, given a list of cricket players you may need to find
all those players whose names begin with A or B. In other words, you need to
match against a set of patterns. A regular expression helps you define such a
pattern set in a terse way. For example, if you want to match any number in
which no digit other than 9 is used (e.g., 9, 99, 999, 9999, ...), it is
impossible to write out the entire pattern set. But a regular expression can
express the same set very easily. Let's see what a regular expression is and
how it is used.

Here are a few examples of regular expressions. You will begin to understand
how they represent their pattern sets as you study this chapter.

    9*      => Any number that contains only the digit 9 (e.g., 99, 9999, etc.)
    India.* => Any string beginning with India (e.g., India, Indian, Indiana, etc.)

A regular expression is a sequence of characters that represents patterns.
The pattern can be a simple word, like "India", or can describe a more general
set of patterns like "India", "Indian", "Indiana", etc. Using regular expressions
you can create general patterns like "any 3-digit number that does not contain
the digit 2".

What is meant by "regular" in the term regular expression? The term "regular"
refers to the fact that there is a pre-defined repetition that it denotes. If the
repetitions are irregular, then you cannot denote the pattern with a regular
expression. For example, the set of all the prime numbers cannot be denoted
using a regular expression!

What is meant by "expression" in the term regular expression? The
"expression" in regular expression refers to the fact that, just like
mathematical expressions, regular expressions can be combined together to
form new and more complex regular expressions.

By the way, regular expressions are often referred to as regEx by developers.

5.2.2 The Structure of a Regular Expression

All single characters, including characters like a, =, 3, etc., are fundamental
regular expressions. They match the single character they represent. Most
characters, including all letters and digits, are regular expressions that match
themselves. The fundamental regular expressions can be combined to create
more complex regular expressions. Let's see how we create more complex
regEx.

There are three important parts to a regular expression:
Anchors
Character sets
Modifiers

Anchors are used to specify the position of the pattern in relation to a line of
text.

Character sets match one or more characters in a single position.

Modifiers specify how many times the previous character set is repeated.

A simple example that demonstrates all three parts is the regular expression
"^#*". The up arrow is an anchor that indicates the beginning of the line. The
character "#" is a simple character set that matches the single character "#".
The asterisk is a modifier. In a regular expression the asterisk specifies that the
character set can appear any number of times, including zero.
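
The three parts can be tried out with grep. This is a minimal sketch: the sample lines and the function names are made up, and the -o option (print only the matched part) is assumed to be available, as it is in GNU grep.

```shell
#!/bin/bash
# Sketch: anchor (^), character set (#) and modifier (*) with grep.

sample=$(printf '# comment\n## heading\ncode # trailing\nplain\n')

# ^#   : count the lines that start with "#".
comment_lines() { printf '%s\n' "$sample" | grep -c '^#'; }

# ^##* : one "#" at the start of the line, followed by zero or more "#".
hash_runs() { printf '%s\n' "$sample" | grep -o '^##*'; }

# ^#*  : zero or more "#" at the start of the line, so every line matches.
any_lines() { printf '%s\n' "$sample" | grep -c '^#*'; }

comment_lines
hash_runs
any_lines
```

Because the asterisk also allows zero repetitions, "^#*" on its own matches every line; writing "^##*" (or "^#") is what restricts the match to lines that really begin with "#".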

5.2.3 Regular expression metacharacters

A few special characters specify how the preceding character or expression
may repeat, or where in the text a match may occur. These special characters
are called metacharacters.

The table below lists various metacharacters and their meanings.

Table: Regular expression metacharacters

Operator    Effect
.           Matches any single character.
?           The preceding item is optional and will be matched, at most, once.
*           The preceding item will be matched zero or more times.
+           The preceding item will be matched one or more times.
{N}         The preceding item will be matched exactly N times.
{N,}        The preceding item will be matched N or more times.
{N,M}       The preceding item will be matched at least N times, but not more
            than M times.
-           Represents the range if it's not first or last in a list or the
            ending point of a range in a list.
^           Matches the empty string at the beginning of a line; also
            represents the characters not in the range of a list.
$           Matches the empty string at the end of a line.
\b          Matches the empty string at the edge of a word.
\B          Matches the empty string provided it's not at the edge of a word.
\<          Matches the empty string at the beginning of a word.
\>          Matches the empty string at the end of a word.











In the example below, the * indicates zero or more repetitions of 9.

    9* => Any number that contains only the digit 9 (e.g., 99, 9999, etc.)

In the example below, the . indicates any character and therefore .* indicates
any number of repetitions of any character.

    India.* => Any string beginning with India (e.g., India, Indian, Indiana, etc.)

So, for example, India.* will also match India123, IndiaZZZ, etc.
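
The repetition operators can be tried out with grep -E (extended regular expressions). The helper function matches and the pattern colou?r are assumptions made for this sketch; -x makes grep compare the pattern against the whole line.

```shell
#!/bin/bash
# Sketch: trying out repetition metacharacters with grep -E.

# matches REGEX STRING: succeed if the whole STRING matches REGEX.
matches() {
    printf '%s\n' "$2" | grep -E -x -q -- "$1"
}

matches '9*'         '9999'     && echo "9* matches 9999"
matches 'India.*'    'IndiaZZZ' && echo "India.* matches IndiaZZZ"
matches 'colou?r'    'color'    && echo "colou?r matches color"
matches '[0-9]{2,3}' '123'      && echo "[0-9]{2,3} matches 123"
matches '[0-9]{2,3}' '1234'     || echo "[0-9]{2,3} does not match 1234"
```

Without -x, grep reports a match as soon as any part of the line fits the pattern, so 1234 would be accepted by [0-9]{2,3} as well.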

5.2.4 Creating complex regular expressions by concatenating other regEx

Suppose you want to use a regular expression to match any string in which the
letter A repeats one or more times (e.g., A, AA, AAA, etc.). Then the
regular expression for this is

    A+ => will match A, AA, AAA, etc. but will not match an empty string.

Now suppose you want to use a regular expression to match any string in
which the digit 4 repeats any number of times.

    4* => will match 4, 44, 444, etc. and will also match an empty string.

Now, suppose you want to create a regular expression to match any string in
which first the letter A repeats one or more times and then the
digit 4 repeats any number of times (e.g., A4, A444, AA4, etc). You can
combine the regular expressions created earlier:

    A+4* => will match A4, A44, AA4, etc. but will not match 4AA.


5.2.5 Using metacharacters on regEx to create complex regEx

Now, suppose you want to create a regular expression that denotes an
unsigned real number. You can use the following regEx for it:

    [0-9]+(\.[0-9]+)? => will match 4, 0.32, etc. but will not match -5, .33
    or 7e-3.

Let's dissect this example to understand it better:

First, [0-9]+ will match one or more occurrences of a digit.

To make the fractional part, we need to allow a dot (e.g., the dot in 0.32). So
we have \. there.

The fractional part, if present, needs to again have at least one digit, so we
have the complete fractional part written as \.[0-9]+ there.

However, we need to make sure that the fractional part is optional (the regEx
should match numbers without a fractional part too). So the fractional part
is grouped in parentheses and made optional by putting a question mark after
it, making the entire regEx [0-9]+(\.[0-9]+)?
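
The unsigned-real-number regEx can be checked against whole strings with grep -E. This is a sketch: the function name is_unsigned_real is made up, and -x is used so that the pattern has to cover the entire string.

```shell
#!/bin/bash
# Sketch: checking the unsigned-real-number regEx against whole strings.

is_unsigned_real() {
    # -E: extended regEx, -x: match the whole line, -q: no output.
    printf '%s\n' "$1" | grep -E -x -q '[0-9]+(\.[0-9]+)?'
}

for candidate in 4 0.32 -5 .33 7e-3; do
    if is_unsigned_real "$candidate"; then
        echo "$candidate: match"
    else
        echo "$candidate: no match"
    fi
done
```

The -x option matters here: without it grep would find the digit 5 inside -5 and report a match, because a plain grep search only needs some part of the line to fit the pattern.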

5.3 The grep command

Unix has a command that performs searches based on regular expressions. This
command is called grep. grep searches the input for lines containing a match
to a given pattern list. When it finds a match in a line, it prints the line.

Note that the grep command does not match patterns across multiple lines.
Here are a few examples of grep.

    bash> grep root /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    operator:x:11:0:operator:/root:/sbin/nologin

    bash> grep -n root /etc/passwd      # prints line numbers of matches
    1:root:x:0:0:root:/root:/bin/bash
    12:operator:x:11:0:operator:/root:/sbin/nologin

    bash> grep -v bash /etc/passwd | grep -v nologin    # matching inverted
    sync:x:5:0:sync:/sbin:/bin/sync
    shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
    halt:x:7:0:halt:/sbin:/sbin/halt
    news:x:9:13:news:/var/spool/news:
    apache:x:48:48:Apache:/var/www:/bin/false

    bash> grep -c false /etc/passwd     # returns number of matches
    7

    bash> grep -i root /etc/passwd      # match regardless of the case
    Root:0:0:/root
    root:0:0:/sysadm

With the first command, the user displays the lines from /etc/passwd containing
the string root. The second command also displays the line numbers of the lines
containing this search string.

With the third command the user checks which users are not using bash, but
accounts with the nologin shell are not displayed.

Then the user counts the number of accounts that have /bin/false as the shell.

The last command displays the lines containing root or Root or ROOT, etc.

Now let's see what else we can do with grep, using regular expressions.

5.3.1 Grep and regular expressions

a. Line and word anchors

From the previous example, we now exclusively want to display lines starting
with the string "root":

    bash> grep ^root /etc/passwd
    root:x:0:0:root:/root:/bin/bash

If we want to see which accounts have no shell assigned whatsoever, we
search for lines ending in ":":

    bash> grep :$ /etc/passwd
    news:x:9:13:news:/var/spool/news:

To check that PATH is exported in ~/.bashrc, first select "export" lines and
then search for words starting with the string "PATH", so as not to display
MANPATH and other possible paths:

    bash> grep export ~/.bashrc | grep '\<PATH'
    export PATH="/bin:/usr/lib/mh:/lib:/usr/bin:/usr/local/bin:/usr/ucb:/usr/dbin:$PATH"

If you want to find a string that is a separate word (enclosed by spaces), it is
better to use the -w option, as in this example, which searches myFile.txt for
the word man:

    bash> cat myFile.txt
    Neil Armstrong was the first man to walk on the moon.
    He had said, this is a small step for me but a huge step for mankind.

    bash> grep -w man myFile.txt
    Neil Armstrong was the first man to walk on the moon.

Note that the other line is not matched: mankind is a single word, so it does
not match the word man when the -w option is used. If this option is not used,
both lines are displayed, because mankind also contains the string man.


b. Character classes

A bracket expression is a list of characters enclosed by "[" and "]". It matches
any single character in that list; if the first character of the list is the caret, "^",
then it matches any character NOT in the list. For example, the regular
expression "[0123456789]" matches any single digit. You can also write it as
[0-9].

Within a bracket expression, a range expression consists of two characters
separated by a hyphen. It matches any single character that sorts between
the two characters, inclusive, using the locale's collating sequence and
character set. For example, in the default C locale, "[a-d]" is equivalent to
"[abcd]". Many locales sort characters in dictionary order, and in these locales
"[a-d]" is typically not equivalent to "[abcd]"; it might be equivalent to
"[aBbCcDd]", for example. To obtain the traditional interpretation of bracket
expressions, you can use the C locale by setting the LC_ALL environment
variable to the value "C".
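
The effect of forcing the C locale can be tried with a short sketch. The word
list is made-up sample data; only the LC_ALL=C setting comes from the text.

```shell
#!/bin/bash
# Hypothetical sample data: one word per line, mixed case.
words=$(printf '%s\n' apple Banana cherry Date egg)

# With LC_ALL=C, [a-d] means exactly the lowercase letters a, b, c and d,
# so only lines that start with one of those letters are selected.
c_locale_matches=$(printf '%s\n' "$words" | LC_ALL=C grep '^[a-d]')

printf '%s\n' "$c_locale_matches"
```

In the C locale this selects "apple" and "cherry" only; in a locale that sorts
in dictionary order, the same range might also select the capitalised words.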

Finally, certain named classes of characters are predefined within bracket
expressions. See the grep man or info pages for more information about
these predefined expressions.















In the example, all the lines containing either a "y" or "f" character are first
displayed, followed by an example of using a range with the ls command.

c. Wildcards

Use the "." for a single character match. If you want to get a list of all
five-character English dictionary words starting with "c" and ending in "h"
(handy for solving crosswords):












If you want to display lines containing the literal dot character, use the -F
option to grep.
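
The difference between a regular-expression dot and a literal dot can be seen
in a minimal sketch. The three sample lines are hypothetical.

```shell
#!/bin/bash
# Hypothetical sample lines; only the first contains a literal dot.
sample=$(printf '%s\n' 'file.txt' 'filetxt' 'README')

# As a regular expression, "." matches any character, so every line matches.
regex_count=$(printf '%s\n' "$sample" | grep -c '.')

# With -F the pattern is a fixed string, so only the literal dot matches.
fixed_matches=$(printf '%s\n' "$sample" | grep -F '.')

echo "regex matches: $regex_count"
echo "fixed-string matches: $fixed_matches"
```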
bash> grep '[yf]' /etc/group
sys:x:3:root,bin,adm
tty:x:5:
mail:x:12:mail,postfix
ftp:x:50:
nobody:x:99:
floppy:x:19:
xfs:x:43:
nfsnobody:x:65534:
postfix:x:89:

bash> ls *[1-9].xml
app1.xml chap1.xml chap2.xml chap3.xml chap4.xml

bash> grep '\<c...h\>' /usr/share/dict/words
catch
clash
cloth
coach
couch
cough
crash
crush


For matching multiple characters, use the asterisk. This example selects all
words starting with "c" and ending in "h" from the system's dictionary:



5.4 Pattern matching using shell

5.4.1 Character ranges

Apart from grep and regular expressions, there's a good deal of pattern
matching that you can do directly in the shell, without having to use an
external program.
As you already know, the asterisk (*) and the question mark (?) match any
string or any single character, respectively. Quote these special characters to
match them literally:





But you can also use the square braces to match any enclosed character or
range of characters, if pairs of characters are separated by a hyphen. An
example:



Lists all files in radha's home directory, starting with "a", "b", "c", "x", "y" or "z".

If the first character within the braces is "!" or "^", any character not enclosed
will be matched. To match the dash ("-"), include it as the first or last
character in the set. The sorting depends on the current locale and on the
value of the LC_COLLATE variable, if it is set. Mind that other locales might
interpret "[a-cx-z]" as "[aBbCcXxYyZz]" if sorting is done in dictionary order.
If you want to be sure to have the traditional interpretation of ranges, force this
behavior by setting LC_COLLATE or LC_ALL to "C".
bash> ls -ld [a-cx-z]*
drwxr-xr-x  2 radha radha 4096 Jul 20  2002 app-defaults/
drwxrwxr-x  4 radha radha 4096 May 25  2002 arabic/
drwxrwxr-x  2 radha radha 4096 Mar  4 18:30 bin/
drwxr-xr-x  7 radha radha 4096 Sep  2  2001 crossover/
drwxrwxr-x  3 radha radha 4096 Mar 22  2002 xml

bash> grep '\<c.*h\>' /usr/share/dict/words
caliph
cash
catch
cheesecloth
cheetah

bash> ls "*"
This will not list all the files. It will list only the file named *.

5.4.2 Character classes

Character classes can be specified within the square braces, using the syntax
[:CLASS:], where CLASS is defined in the POSIX standard and has one of the
values
"alnum", "alpha", "ascii", "blank", "cntrl", "digit", "graph", "lower", "print",
"punct", "space", "upper", "word" or "xdigit".











When the extglob shell option is enabled (using the shopt builtin), several
extended pattern matching operators are recognized.
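
A minimal sketch of a few extended operators follows. The classify function and
its sample arguments are hypothetical and exist only for this illustration;
shopt -s extglob and the +(...), @(...) and !(...) operators are standard bash.

```shell
#!/bin/bash
# Enable extended pattern matching operators such as +(...) and !(...).
shopt -s extglob

# classify is a hypothetical helper used only for this illustration.
classify() {
    case "$1" in
        +([0-9]))   echo "number" ;;   # one or more digits
        @(yes|no))  echo "answer" ;;   # exactly one of the alternatives
        !(*.sh))    echo "other"  ;;   # anything that does not end in .sh
        *)          echo "script" ;;   # what is left: names ending in .sh
    esac
}

classify 2011
classify yes
classify notes.txt
classify backup.sh
```

The option has to be switched on before bash reads the lines that use the
operators, which is why shopt comes first in the script.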


5.5 Summary

Regular expressions are powerful tools for selecting particular lines from files
or output. A lot of UNIX commands use regular expressions: vim, perl, the
PostgreSQL database and so on. They can be made available in any
language or application using external libraries, and they have even found their
way to non-UNIX systems. For instance, regular expressions are used in the
Excel spreadsheet that comes with the Microsoft Office suite. In
this chapter we got the feel of the grep command, which is indispensable in
any UNIX environment.

Bash has built-in features for matching patterns and can recognize character
classes and ranges.



Self-check Questions

1. What are regular expressions?
2. What will be the result of ls -l | grep '^.....w'?
3. What does the expression gg* signify?
4. How do you locate lines in a file foo containing "ram" and "raman" using grep?




bash> ls -ld [[:digit:]]*
drwxrwxr-x  2 radha radha   4096 Apr 20 13:45 2/

bash> ls -ld [[:upper:]]*
drwxrwxr--  3 radha radha   4096 Sep 30  2001 Nautilus/
drwxrwxr-x  4 radha radha   4096 Jul 11  2002 OpenOffice.org1.0/
-rw-rw-r--  1 radha radha 997376 Apr 18 15:39 Schedule.sdc

5.6 Answers to the Self-Check questions

1. A regular expression is a pattern that describes a set of strings.
2. This locates all files which have write permission for the group (e.g. drwxrw-r-x).
3. One or more occurrences of g.
4. Use grep 'rama*n*' foo


5.7 Terminal Questions

1. Describe the structure of a regular expression.
2. Describe some regular expression operators.
3. What is the difference between a wild card and a regular expression?
4. What is the difference between basic and extended regular expressions?






UNIT 3: Advanced Shell Scripting, sed, and
awk

1. FUNCTIONS IN SHELL SCRIPTS .................................................................. 147
2. SED STREAM EDITOR .................................................................................... 159
3. AWK BASICS ........................................................................................................... 169
4. AWK PROGRAMMING ........................................................................................ 177


COE Unit 3, Lesson 1

LESSON 1 FUNCTIONS IN SHELL SCRIPTS

1. FUNCTIONS IN SHELL SCRIPTS ............................................................................. 147
1.0 OBJECTIVES .......................................................................................................... 147
1.1 INTRODUCTION TO SHELL FUNCTIONS................................................................... 147
1.1.1 When to use functions? ................................................................................. 147
1.1.2 Benefits of using functions ............................................................................ 149
1.1.3 Where you cannot create functions?........................................................... 150
1.2 WRITING A SHELL FUNCTION ................................................................................. 150
1.2.1 Function header.............................................................................................. 150
1.2.2 Function body ................................................................................................. 151
1.2.3 Returning from a function.............................................................................. 152
1.2.4 Function arguments ....................................................................................... 152
1.2.5 IFS (internal field separators) ....................................................................... 153
1.2.6 Creating a utility library of shell functions ................................................... 154
1.2.7 Things to keep in mind while writing shell functions ................................. 154
1.3 SUMMARY .............................................................................................................. 155
1.4 ANSWERS TO THE SELF CHECK QUESTIONS ......................................................... 155
1.5 TERMINAL QUESTIONS .......................................................................................... 155


1. Functions in Shell Scripts




So far you have learnt various Unix commands, how to plumb commands together using
pipes, and how to create shell scripts that carry out useful, routine and
repetitive tasks. Shell scripting in Unix is not complete without knowing how
to write and use functions.



1.0 Objectives

After going through these lessons you will know
When to use functions in shell scripts
How to write and use functions in shell scripts


1.1 Introduction to shell functions

Often there are a few lines of code that need to be used at several places in a
shell script. For example, if you are creating a shell script that reads a 3
digit STD code and a 7 digit phone number, and you need to ensure that the user
types in exactly 3 numeric characters for the STD code and exactly 7 numeric
characters for the phone number, then it is better to create and use a
function instead of replicating the same code at multiple places.

A function is like a mini script. It can take parameters, can define its own
variables, can return a value, etc. Unlike a call to a script, a function executes in
the same shell. Functions in shell scripts look and work similar to functions in
the C language.

1.1.1 When to use functions?

Consider the example listed in 1.1 above, without using functions:

































Script 1

You will find that apart from the marked text below, the rest of the code is
repeated.
#!/bin/bash

stdOK=0
while [ "$stdOK" -ne 1 ]
do
    echo "Please enter 3 digit STD code:"
    read std
    # chkSTD is non-empty only when the input is exactly 3 digits
    chkSTD=`echo "$std" | grep '^[0-9][0-9][0-9]$'`
    if [ "X$chkSTD" != "X" ]; then
        stdOK=1
    else
        echo "Please enter exactly 3 digit STD code here"
    fi
done

phoneOK=0
while [ "$phoneOK" -ne 1 ]
do
    echo "Please enter 7 digit phone number:"
    read phone
    # chkPH is non-empty only when the input is exactly 7 digits
    chkPH=`echo "$phone" | grep '^[0-9][0-9][0-9][0-9][0-9][0-9][0-9]$'`
    if [ "X$chkPH" != "X" ]; then
        phoneOK=1
    else
        echo "Please enter exactly 7 digit phone number here"
    fi
done

callup $std $phone







Script 2

See how much simpler it would be if you had a function that got you the
desired numbers!!









Script 3
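
Script 3 calls a function named getNumber that the text does not define. One
possible shape for it is sketched below, under stated assumptions: the first
argument is taken to be the number of digits, the remaining arguments describe
the value being asked for, and prompts are written to standard error so that
the backticks in Script 3 capture only the number. None of these details is
given in the original.

```shell
#!/bin/bash
# getNumber DIGITS DESCRIPTION...
# Hypothetical helper assumed by Script 3: keeps asking until the user
# types exactly DIGITS numeric characters, then echoes that value.
getNumber() {
    digits=$1
    shift                      # the remaining arguments describe the value
    while :
    do
        # Prompts go to standard error so that only the final value is
        # captured by the caller's command substitution.
        echo "Please enter $digits digit $*:" >&2
        read num || return 1   # give up if the input ends
        if echo "$num" | grep -Eq "^[0-9]{$digits}\$"; then
            echo "$num"
            return 0
        fi
        echo "Please enter exactly $digits digit $* here" >&2
    done
}
```

With this definition placed before the calls, std=`getNumber 3 STD code` keeps
prompting until three digits are typed and then stores them in std.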

1.1.2 Benefits of using functions

Functions provide several benefits as listed below:

#!/bin/bash

stdOK=0
while [ "$stdOK" -ne 1 ]
do
    echo "Please enter 3 digit STD code:"
    read std
    chkSTD=`echo "$std" | grep '^[0-9][0-9][0-9]$'`
    if [ "X$chkSTD" != "X" ]; then
        stdOK=1
    else
        echo "Please enter exactly 3 digit STD code here"
    fi
done

phoneOK=0
while [ "$phoneOK" -ne 1 ]
do
    echo "Please enter 7 digit phone number:"
    read phone
    chkPH=`echo "$phone" | grep '^[0-9][0-9][0-9][0-9][0-9][0-9][0-9]$'`
    if [ "X$chkPH" != "X" ]; then
        phoneOK=1
    else
        echo "Please enter exactly 7 digit phone number here"
    fi
done

callup $std $phone
#!/bin/bash

std=`getNumber 3 STD code`
phone=`getNumber 7 phone number`
callup $std $phone

Functions simplify and modularize your scripts. Your scripts become more
readable (compare Script 1 and Script 3 above).
Modularized scripts are easier to maintain and enhance.
Functions make debugging easier.
Once you enhance a function, the enhancement is automatically available
at all places where the function is used.
You can even create a utility file containing functions and source it in your
other scripts so that utility functions are directly available for use, instead of
writing them over and over again.

1.1.3 Where you cannot create functions?

Be aware that not all shells provide support for functions. For example, csh
(C shell) does not provide support for functions.

But most other shells have this support, including sh (Bourne shell), ksh (Korn
shell), bash (Bourne-again shell), etc.



Self Check Questions

1. When a few lines of code need to be repeated at several places, a ______ should
be created for it (select one):
a. script
b. program
c. function
2. A function helps in improving the script by making it (select one or many as
apply):
a. more readable
b. easier to debug
c. modular
d. more maintainable




1.2 Writing a shell function

A shell function in bash has the following syntax. Text in bold indicates
keywords.

<yourFunctionName>() { <commands>; }
Or
function <yourFunctionName> { <commands>; }

1.2.1 Function header

You can define a function by using the function keyword, or you can define a
function by putting parentheses after the function name. For example:

The following defines a function named aaa.

function aaa {
    a=1
}

The following defines a function named bbb.

bbb() {
    a=1
}

Note that parameters to functions are not passed as in C. Therefore, in the
function header you do not declare any parameters. See the definition of
the function bbb above. No parameters are ever listed within the parentheses.

1.2.2 Function body

The set of commands makes up the function body.

A function can contain any set of shell scripting commands, including flow
control commands like while and conditional commands like if.
Commands can also contain calls to other functions and even other shell
scripts.

For example:

getDateString() {
    echo "Date format is dd/mm/yy ?:"
    read x
    if [ "X$x" = "Xy" ]; then
        str=`date +%d/%m/%y`
    else
        str=`date +%y/%m/%d`
    fi
    echo $str
}

The above function calls the date shell command.






Self-Check Questions

3. The function keyword is a must for writing a function (true/false).
4. You must declare arguments to a function in the function header (true/false).
5. You cannot declare arguments to a function in the function header (true/false).
6. Function body can contain any of the shell commands (true/false).



1.2.3 Returning from a function

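A function ends when it reaches the end of its body. Anything it writes with
echo goes to standard output, and the caller can capture it as the function's
result with command substitution. Alternatively, you can leave the function
before the end of its body by using the return keyword. The value given to
return becomes the exit status of the function, an integer from 0 to 255 that
the caller reads from $?.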

For example:


















1.2.4 Function arguments

Parameters can be passed when calling a function by listing them after
the function name. Inside the function, these parameters can be accessed as
the shell variables $1, $2, etc. Even $# (the number of arguments passed) is
available inside the function.

Example:
aaa() {
    a=1
    b=2
    echo $a
}

ret=`aaa`    # ret will be 1 (the captured output)

bbb() {
    a=1
    b=2
    return $a
}

bbb
ret=$?       # ret will be 1 (the exit status)
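
Because return sets an exit status, it fits functions that answer a yes/no
question. The sketch below uses a hypothetical helper named isThreeDigits to
show the two usual ways of reading that status.

```shell
#!/bin/bash
# isThreeDigits is a hypothetical helper: it reports success or failure
# through its exit status instead of printing a value.
isThreeDigits() {
    case "$1" in
        [0-9][0-9][0-9]) return 0 ;;   # 0 means success (true)
        *)               return 1 ;;   # non-zero means failure (false)
    esac
}

# The exit status can be tested directly by if ...
if isThreeDigits 011; then
    echo "011 is a valid STD code"
fi

# ... or read from $? immediately after the call.
isThreeDigits 11
status=$?
echo "status for 11 is $status"
```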

















1.2.5 IFS (internal field separators)

You need to be careful while passing arguments to a function or a shell
command in a shell script. The shell splits the command line into words before
the function sees it. As a result, a string passed as a parameter is
interpreted as multiple parameters if it contains spaces.

For example:

paintObj greenish blue

Here you might expect to see $1 inside the function as "greenish blue", but
you will get $1 as "greenish" and $2 as "blue". To pass the string as a single
parameter, quote it:

paintObj "greenish blue"

The same splitting happens when an unquoted variable is expanded, and there it
is controlled by IFS, whose default value is space, tab and newline. You can
tell the shell to treat only newline as a field separator by declaring in your
script

IFS='
'
# Yes, the closing quote is on the next line!

Therefore, if you use the following:

IFS='
'
color="greenish blue"
paintObj $color

you will get $1 inside the function as "greenish blue".
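
How quoting and IFS affect the number of parameters a function receives can be
checked with a short sketch. The countArgs function and the color variable are
hypothetical names introduced for this illustration.

```shell
#!/bin/bash
# countArgs is a hypothetical function that reports how many
# parameters it received and what the first one is.
countArgs() {
    echo "$# argument(s), first is: $1"
}

color="greenish blue"

unquoted=$(countArgs $color)      # default IFS: split at the space
quoted=$(countArgs "$color")      # quotes: passed as one parameter

# Restrict IFS to newline only, then expand without quotes.
oldIFS=$IFS
IFS='
'
newline_ifs=$(countArgs $color)   # no newline in the value, so no split
IFS=$oldIFS                       # restore the previous separators

echo "$unquoted"
echo "$quoted"
echo "$newline_ifs"
```

Quoting the expansion is usually the simpler choice, because a changed IFS
affects every later unquoted expansion in the script until it is restored.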



Self-Check Questions

7. Parameters passed to a function are accessible using the $1, $2, ... variables
(true/false).
8. The $# inside a function indicates the number of parameters passed to the script
(true/false).

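addTwoNums() {
    sum=0
    sum=`expr $1 + $2`
    return $sum       # exit status, so the sum must stay within 0 to 255
}

addAllNums() {
    sum=0
    # add the parameters one by one until none are left
    while [ "X$1" != "X" ]
    do
        sum=`expr $sum + $1`
        shift
    done
    return $sum
}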


1.2.6 Creating a utility library of shell functions

When you create shell functions, you would typically want to make them
somewhat generic so that they can be reused in other shell scripts as well. In
such cases, you can simply collect your shell functions into a single file. Such
a file containing utility shell functions can be used as a library and can be
sourced in other shell scripts.

For example:



1.2.7 Things to keep in mind while writing shell functions

Just like other shell commands, function definitions follow some syntax rules.
The opening curly bracket is normally placed on the same line as the function
header.
There must be a space or a new line after the opening curly bracket.
There must be either a semicolon or a new line before the closing curly
bracket.


bash> cat a_simple_utility_library.sh

#!/bin/sh
#---------- a_simple_utility_library ----------
IFS='
'

# myecho echoes its input and also appends it to
# each of the files listed in FILE_LIST
myecho() {
    for i in $FILE_LIST
    do
        echo $* >> $i
    done
    echo $*
}

# mykill kills the processes whose ps entry matches $1
mykill() {
    pid=`ps -ef | grep $1 | grep -v grep | awk '{print $2}'`
    kill $pid
}
#---------- a_simple_utility_library ends ----------



bash> cat my_application.sh

#!/bin/sh
. ./a_simple_utility_library.sh   # The dot at the beginning sources it
mykill junkjob                    # will kill the process running junkjob
1.3 Summary

Functions help in modularizing scripts for repetitive tasks. If you use
functions, scripts become more readable and maintainable.


1.4 Answers to the self check questions

1. (c)
2. all.
3. false.
4. false.
5. true.
6. true.
7. true.
8. false.


1.5 Terminal Questions

1. Discuss among your peers how functions are different from aliases.
2. Write a function that gets you a non-empty string.
3. Write a function that uses the function created in assignment 2 above to read
a string and convert it into all uppercase.
4. Write a script that takes the name, middle name and family name of a person and
prints them out in all uppercase or all lowercase depending on a shell
variable's value.
5. Write a function that takes a number of digits as an input and gets a number
containing that many digits. The function must check that the user has
provided a number.
6. Write a function that takes a number of digits as an input and gets a number
containing at most that many digits.



COE Unit 3, Lesson 2

LESSON 2 SED STREAM EDITOR

2. SED STREAM EDITOR ............................................................................................ 159
2.0 OBJECTIVES .......................................................................................................... 159
2.1 INTRODUCTION TO SED ......................................................................................... 159
2.2 HOW SED OPERATES ............................................................................................. 159
2.3 SYNTAX OF THE SED COMMAND ............................................................................ 160
2.3.1 Options for the sed ......................................................................................... 160
2.4 COMMANDS IN SED................................................................................................ 161
2.4.1 Syntax of the commands in sed ................................................................... 162
2.5 SUMMARY .............................................................................................................. 164
2.6 ANSWERS TO THE SELF CHECK QUESTIONS ......................................................... 164
2.7 TERMINAL QUESTIONS .......................................................................................... 165


2. SED Stream Editor




Sed (stream editor) is a utility program available in Unix. It is a powerful utility that
can be used to transform its input, line by line, and is commonly used in scripting.



2.0 Objectives

After going through these lessons you will know

What is sed? Its options and commands
What are regular expressions
Interactive use of sed
Using sed commands in scripts


2.1 Introduction to sed

A stream editor is used to perform transformations on text read from a file or
a pipe. Sed sends the result to the standard output, which can be redirected
and collected into another file, if needed.

Sed does not modify the original input file. Unlike vi and ed,
which are interactive editors, sed works on an input stream. Sed is therefore
suitable in scripts when you need text transformations, as in conversion
programs.

For example, if you have a file where "error" is misspelt as "erorr", you can
correct it by using the sed command:
sed 's/erorr/error/g' myfile > myfile_corrected


2.2 How sed operates

It often comes in handy to know how a utility works. Here is the detail on how
sed works:
A line of input is copied into a pattern space.
All editing commands in a sed script are applied in order to the copied line.
The copied (and modified) line is sent to standard output.
By default, sed works on all the lines of input. However, its scope can be
controlled by line addressing.
Editing commands are applied to all lines (globally) unless line addressing
restricts the lines affected.
If a command changes the input, subsequent command addresses will be
applied to the current line in the pattern space, not the original input line.
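
The last point can be seen in a two-command sketch. The input string "abc" is
an arbitrary sample.

```shell
#!/bin/bash
# Both commands run on the same pattern space, in order:
#   "abc" -> s/a/b/ -> "bbc" -> s/b/c/ -> "cbc"
# The second command sees the result of the first, not the original line.
result=$(echo "abc" | sed -e 's/a/b/' -e 's/b/c/')
echo "$result"
```

Had the second command worked on the original line, the "b" produced by the
first substitution would have been left alone.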


2.3 Syntax of the sed command

Sed can be invoked in one of the following forms:

sed [options] 'command' file(s)
Or
sed [options] -f scriptfile file(s)

The first form allows you to specify an editing command on the command line,
surrounded by single quotes.

The second form allows you to specify a scriptfile, a file containing sed
commands. If no files are specified, sed reads from standard input.

2.3.1 Options for the sed

The -e option
-e <script> : Tells sed to add the commands in <script> to the set of
commands to run. You can give a series of commands using the -e option. For
example:

sed -e 's/a/A/' -e 's/b/B/' < oldFile > newFile

The -f option
-f <scriptFile> : Tells sed to add the commands from <scriptFile> to the set of
commands to run. For example, instead of just replacing "a" and "b", if you
want to uppercase all vowels in the input, you can write a sed script file:

bash> cat sed_script
# sed comment - This script changes lower case vowels to upper case
s/a/A/g
s/e/E/g
s/i/I/g
s/o/O/g
s/u/U/g

sed -f sed_script < oldFile > newFile

will uppercase all vowels.

Note that in sed script files, each command must be on a separate line. No
trailing white space should exist at the end of lines, and the commands are not
enclosed in quotes.

The -n option
-n : This option tells sed not to print by default. Only the items selected by
specific sed print commands are printed. For example,

sed -n 's/pattern/&/p' file

will act like grep looking for pattern.
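
A minimal sketch of the difference the -n option makes, using three
hypothetical sample lines:

```shell
#!/bin/bash
# Hypothetical sample input with one misspelt line.
sample=$(printf '%s\n' 'no problem here' 'an erorr occurred' 'all fine')

# Without -n every line is printed; with -n only lines selected by p appear.
all_lines=$(printf '%s\n' "$sample" | sed 's/erorr/error/' | wc -l)
only_changed=$(printf '%s\n' "$sample" | sed -n 's/erorr/error/p')

echo "lines without -n: $all_lines"
echo "lines with -n and p: $only_changed"
```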




Self-Check Questions

1. sed is an interactive editor like vi (true/false)
2. sed can be used in scripts (true/false)




2.4 Commands in sed

Sed supports grep-like regular expressions to find the text for pattern
substitution and deletion. Sed uses vi-like commands:

a   Appends text below the current line
i   Inserts text above the current line
c   Changes the text in the current line to new text
s   Searches and replaces text
d   Deletes text
p   Prints text

For example, if there is a file that lists tasks like:

bash> cat tasks
DONE: functions
TODO: sed
TODO: awk
DONE: password change

To delete all lines in a file that are marked DONE, you can use

bash> sed '/DONE/d' tasks > new_tasks
bash> cat new_tasks
TODO: sed
TODO: awk

2.4.1 Syntax of the commands in sed

The sed commands have the general form as listed below:

[address][,address][!]operation [arguments]

Sed commands consist of addresses and an operation. Each operation consists of a
single letter.

Let's take the following input file for the examples given below:

bash> cat input_file
This is the first line
This is the second line of text
This is the third line of input_file
This is the fourth and the last line

1. If no address is specified, the operation is applied to each line. For example:

bash> sed 's/This/this/g' < input_file > output_file
bash> cat output_file
this is the first line
this is the second line of text
this is the third line of input_file
this is the fourth and the last line

2. Only the first match on each line is replaced by default. For example:

bash> sed 's/the/a/' < input_file > output_file
bash> cat output_file
This is a first line
This is a second line of text
This is a third line of input_file
This is a fourth and the last line

The second "the" on the last line is not modified.

To tell sed to work on all the matches on a line, use g.

bash> sed 's/the/a/g' < input_file > output_file
bash> cat output_file
This is a first line
This is a second line of text
This is a third line of input_file
This is a fourth and a last line

The second "the" is also modified now.

3. A single address can be given. For example:

bash> sed '2s/second/SECOND/g' < input_file > output_file
bash> cat output_file
This is the first line
This is the SECOND line of text
This is the third line of input_file
This is the fourth and the last line

4. Two addresses can be given to make a block. For example:

bash> sed '1,2s/line/input/g' < input_file > output_file
bash> cat output_file
This is the first input
This is the second input of text
This is the third line of input_file
This is the fourth and the last line

5. $ can be used to denote the last line of the input when specifying addresses. For example:

bash> sed '3,$d' < input_file > output_file
bash> cat output_file
This is the first line
This is the second line of text

6. An address can also be given using a pattern. For example:

bash> sed '/input_file/d' < input_file > output_file
bash> cat output_file
This is the first line
This is the second line of text
This is the fourth and the last line

7. An address can also be inverted.

sed '/SAVE/!d'

This will delete all lines that do not have SAVE on them.

sed '/BEGIN/,/MID/s/erorr/error/g'

This will replace "erorr" by "error" from BEGIN to MID.

sed '/^BEGIN/,/^END/!s/done//g'

This will delete the word "done" on all lines except those between BEGIN
and END.
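
The inverted address, a numeric range and the $ address can be compared side by
side in a short sketch. It reuses the task list from section 2.4 as inline
sample data; the variable names are illustrative only.

```shell
#!/bin/bash
# The task list from section 2.4, supplied inline for this illustration.
tasks=$(printf '%s\n' 'DONE: functions' 'TODO: sed' 'TODO: awk' 'DONE: password change')

# Delete every line that does NOT match TODO.
todo_only=$(printf '%s\n' "$tasks" | sed '/TODO/!d')

# A numeric range combined with !: delete everything except lines 2 to 3.
middle=$(printf '%s\n' "$tasks" | sed '2,3!d')

# $ addresses the last line of the input.
last=$(printf '%s\n' "$tasks" | sed -n '$p')

printf '%s\n' "$todo_only" "$middle" "$last"
```

For this input the first two commands keep the same two TODO lines, one
selected by content and the other by position.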

Addresses and patterns can include grep-like regular expressions as well. For
example:








Self-Check Questions

3. What argument can be used to tell sed to apply operations on all the matched
patterns on a line?
a. none. Sed already does that by default.
b. g
c. i
4. What character can be used to invert the address in sed?
a. none
b. !
c. x



2.5 Summary

The sed stream editor is a powerful command line tool, which can handle
streams of data: it can take input lines from a pipe. This makes it fit for
non-interactive use. The sed editor uses vi-like commands and accepts
regular expressions. The sed tool can read commands from the command line
or from a script. It is often used to perform find-and-replace actions on lines
containing a pattern.


2.6 Answers to the self check questions

1. false.
2. true.
3. (b)
4. (b)


bash> sed -n '/This.*first/p' input_file > output_file
bash> cat output_file
This is the first line

2.7 Terminal Questions

1. Use sed to implement a head-like utility of Unix (prints only the first 5 lines).
2. Use sed to implement a tail-like utility of Unix (prints only the last 5 lines).
3. Print a list of files in your scripts directory, ending in ".sh". Mind that you might
have to unalias ls. Put the result in a temporary file.
4. Make a list of files in /usr/bin that have the letter "a" as the second character.
Put the result in a temporary file.
5. Delete the first 3 lines of each temporary file.
6. Print to standard output only the lines containing the pattern "an".
7. Create a file holding sed commands to perform the previous two tasks. Add
an extra command to this file that adds a string like "*** This might have
something to do with man and man pages ***" in the line preceding every
occurrence of the string "man". Check the results.
8. A long listing of the root directory, /, is used for input. Create a file holding sed
commands that check for symbolic links and plain files. If a file is a symbolic
link, precede it with a line like "This is a symlink". If the file is a plain file,
add a string on the same line, adding a comment like "< this is a plain file".
9. Create a script that shows lines containing trailing white spaces from a file.
This script should use a sed script and show sensible information to the user.
10. Can sed be used to create a tail -f kind of utility?
11. Search the internet to find how a newline can be replaced.
12. The top 4 lines of a file contain names of students and the remaining 4 lines
contain their marks:
bash>cat file
Mohit verma
Sushobhit sinha
Mukul Khan
Naina Suman
20
25
35
28
Using sed and paste command, create another file that will have
Mohit verma 20
Sushobhit sinha 25
Mukul Khan 35
Naina Suman 28



COE Unit 3, Lesson 3

LESSON 3 AWK BASICS

3. AWK BASICS ................................................................................................................ 169
3.0 OBJECTIVES .......................................................................................................... 169
3.1 INTRODUCTION AND BRIEF HISTORY ..................................................................... 169
3.2 THE SYNTAX OF AWK ........................................................................................... 169
3.3 USING AWK .......................................................................................................... 170
3.3.1 The print command in AWK.......................................................................... 171
3.3.2 Accessing fields on a line.............................................................................. 172
3.4 SUMMARY .............................................................................................................. 174
3.5 ANSWERS TO THE SELF CHECK QUESTIONS ......................................................... 174
3.6 TERMINAL QUESTIONS .......................................................................................... 174



3. AWK Basics




AWK is a utility for performing simple text-processing tasks. Awk also
provides a small but powerful language that allows the user to manipulate files
containing columns of data and strings, and to print reports from the data.



3.0 Objectives

After going through this lesson you will know
What is AWK, the syntax of AWK
How is AWK useful
Print command in AWK
How to access fields in AWK


3.1 Introduction and brief history

AWK stands for the names of its authors: "Aho, Weinberger, &
Kernighan".

The original version of AWK was developed in 1977. In Unix it is
available as awk. Advanced versions exist (e.g. nawk, gawk) that
support user-defined functions, multidimensional arrays, the ?: operator,
deleting elements in an array, etc.

Awk operates in a cycle: get a line, process it, get the next line,
process it, and so on. It is an "interpreted" language -- that is, an Awk
program cannot run on its own, it must be executed by the Awk utility
itself.

Like sed, AWK reads an input file or reads from a pipe. It does not
modify the input file and writes its output onto the standard output. In
addition, because AWK is a programming language in itself, awk is
very useful in processing data and printing reports.


3.2 The syntax of AWK




awk [options] '
[ BEGIN {<initializations>} ]
[ <program> ]
[ [ <program> ] ]
...
[ END {<final actions>} ]
' <File Name>


Where each <program> has the format:

[ <search pattern> ] [ {<program actions>} ]

Awk operates as listed below:
1. Perform initialization if BEGIN is given.
2. Read a line of text and break it into fields.
3. For each <program>, check the line against its search pattern.
4. Perform the program actions given by the user for each pattern that matches.
5. Go to step 2 until the input is exhausted.
6. Perform the END actions if specified by the user.

The optional BEGIN clause performs any initializations required before
Awk starts scanning the input file.

The subsequent body of the Awk program consists of a series of
search patterns, each with its own program action. Awk checks each
line of the input file against each search pattern, and performs the
corresponding actions for every pattern that matches.

Once the file has been scanned, an optional END clause can be used
to perform any final actions required.
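The cycle described above can be seen in a minimal sketch. It assumes a four-line sample in the same layout as the toppers.txt data used later in this lesson; the file name toppers_sample.txt is made up for the demonstration.

```shell
# Sketch only: toppers_sample.txt is a hypothetical four-line sample file.
workdir=$(mktemp -d)
cat > "$workdir/toppers_sample.txt" <<'EOF'
Physics 92 2003 Abhay Malhotra
Chemistry 97 2003 Suman Gupta
Maths 99 2003 Suresh Yadav
Physics 94.5 2004 Shriesh Jadhav
EOF

# BEGIN runs once before any input, each pattern is tried on every line,
# and END runs once after the last line.
result=$(awk '
BEGIN     { print "start of input" }
/Physics/ { print "physics line:", $0 }
/Maths/   { print "maths line:", $0 }
END       { print "end of input" }
' "$workdir/toppers_sample.txt")

echo "$result"
rm -rf "$workdir"
```

The Chemistry line matches neither pattern, so nothing is printed for it.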


3.3 Using AWK

We will use the following example data to see how to use awk. This
data is a file containing the top marks for some of the subjects along
with the years and the topper names.

bash>cat toppers.txt
Physics 92 2003 Abhay Malhotra
Chemistry 97 2003 Suman Gupta
Maths 99 2003 Suresh Yadav
Physics 94.5 2004 Shriesh Jadhav
Chemistry 98.5 2004 Shriesh Jadhav
Maths 96 2004 Lokesh Arora
Physics 89 2005 Vandana Agarwal
Chemistry 92 2005 Srinivas Vardharajan
Maths 99 2005 Anup Mathur
Physics 98 2006 Ramakant
Chemistry 88 2006 Raju Pandy
Chemistry 89 2006 Rajni Kumar
Maths 98 2006 Javed M. K. Akhtar

Example 1: Since almost all of the awk syntax is optional, the
simplest useful awk command has one action and no search pattern:

awk '{print}' input_file



This will work like the cat command and print the entire input_file as is.

Note that here we are running AWK code using the awk command. The
code is always kept within single quotes, so that the shell passes it to
awk unchanged.

Example 2: You can ask awk to work on specific lines. For example,
you can give a search pattern.

awk '/Physics/' toppers.txt > phy_toppers.txt

Note that AWK does not modify the input.
Also note that AWK writes output to the standard output.

Here we have redirected the output into a file phy_toppers.txt. Now
let's see the contents of this output file:

bash>cat phy_toppers.txt
Physics 92 2003 Abhay Malhotra
Physics 94.5 2004 Shriesh Jadhav
Physics 89 2005 Vandana Agarwal
Physics 98 2006 Ramakant

Example 3: Pattern matching is case-sensitive. For example, if
you gave physics in place of Physics as the search pattern, it
would not match the lines containing Physics.

awk '/physics/' toppers.txt > no_match.txt

The file no_match.txt will be empty.



Self-Check Questions

1. Awk is useful for processing text containing columns of data. (true/false).
2. Awk is a small programming language in itself (true/false).
3. Awk does not modify the input file (true/false).
4. An Awk program cannot run on its own. Which utility is needed to run it?
(a) awk (b) sed (c) grep



3.3.1 The print command in AWK

A simple print command is available in AWK. This command does not need
any format specifications and values can be printed in a simple way.

Example 4: If you use print with no arguments, it prints the input text as
is.

awk '/Physics/ { print }
     /Maths/   { print }' toppers.txt

will print
Physics 92 2003 Abhay Malhotra
Maths 99 2003 Suresh Yadav
Physics 94.5 2004 Shriesh Jadhav
Maths 96 2004 Lokesh Arora
Physics 89 2005 Vandana Agarwal
Maths 99 2005 Anup Mathur
Physics 98 2006 Ramakant
Maths 98 2006 Javed M. K. Akhtar

Example 5: You can give arguments to print

awk '/Maths/ {print "This is a math topper"}' toppers.txt

will print
This is a math topper
This is a math topper
This is a math topper
This is a math topper

Note an important point here. Literal text given to print must be enclosed
in double quotes. Arguments separated by commas are printed with a single
space between them, while arguments written side by side without a comma
are joined with nothing in between, so any extra spacing must be added in
the print command itself. We will see more concrete examples of the print
command in subsequent examples.

3.3.2 Accessing fields on a line

The power of AWK lies in the fact that it treats each input line as a
record consisting of fields. This means that, as it reads lines, it breaks
up each line into fields and lets you access and manipulate the fields and
the output.

By default AWK uses white space (spaces and tabs) as the separator for
fields, which means that when it reads a line, it breaks it up into words
for you. The separator can be changed easily, as we will see later in this
unit.

To access the fields of the input line, awk provides the following built-in
variables: $0, $1, $2, ... $9. The first one, $0, gives you the entire line,
as is. $1 is the first field, $2 is the second field, and so on.

Example 6: If the input line just read in by awk is

Physics 92 2003 Abhay Malhotra

then,
$1 will contain Physics
$2 will contain 92
$3 will contain 2003
$4 will contain Abhay
$5 will contain Malhotra

Note that because awk is treating space as the separator, it breaks up
the name too into two separate fields.
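The field breakdown above can be checked directly with a small sketch. The labels such as subject= are added only for this illustration; the input line is the one used in Example 6.

```shell
# Sketch: feed one line to awk and show what each field variable holds.
line='Physics 92 2003 Abhay Malhotra'

fields=$(printf '%s\n' "$line" | awk '{
    print "line=" $0
    print "subject=" $1
    print "marks=" $2
    print "year=" $3
    print "name=" $4 " " $5
}')

echo "$fields"
```

Because the name is split into $4 and $5, the last print joins the two fields again with an explicit space.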

Example 7: To print just the names of chemistry toppers, you can use
the following command:

awk '/Chemistry/ {print $4, $5, $6, $7, $8}' toppers.txt > chem_toppers.txt

bash>cat chem_toppers.txt
Suman Gupta
Shriesh Jadhav
Srinivas Vardharajan
Raju Pandy
Rajni Kumar

Note that we have used $4 to $8, though the names that we got in the
output would have come even with $4 and $5 alone, because the names of
the chemistry toppers occupy only two fields each. However, the data
also has a longer name (Javed M. K. Akhtar) which occupies 4 fields.
Therefore we need to be aware of the data when printing multiple fields.
The print command has no shorthand for "print all fields from $4
onwards", so we list additional fields; a field that does not exist on a
line is printed as an empty string. However, if you simply want to print
the entire line, then you do not need to use these fields. For example,

Example 8: To print all data for math toppers, use the following

awk '/Maths/ {print}' toppers.txt > math_toppers.txt
#Note no $1, $2 used

bash>cat math_toppers.txt
Maths 99 2003 Suresh Yadav
Maths 96 2004 Lokesh Arora
Maths 99 2005 Anup Mathur
Maths 98 2006 Javed M. K. Akhtar

The examples so far solved things that can also be solved by a combination
of grep, sed, cut, etc. However, AWK is much more capable. We will
see the other features in the next lesson.
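The overlap with grep and cut can be made concrete with a rough comparison. This is a sketch under assumptions: sample.txt is a made-up three-line file in the toppers.txt layout, with fields separated by exactly one space so that cut can split them.

```shell
# Sketch only: sample.txt is a hypothetical file with single-space separators.
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
Physics 92 2003 Abhay Malhotra
Maths 99 2003 Suresh Yadav
Maths 96 2004 Lokesh Arora
EOF

# Selecting lines: grep and an awk search pattern give the same result.
with_grep=$(grep 'Maths' "$workdir/sample.txt")
with_awk=$(awk '/Maths/' "$workdir/sample.txt")

# Selecting columns: grep piped into cut, versus a single awk command.
cols_cut=$(grep 'Maths' "$workdir/sample.txt" | cut -d' ' -f3,4)
cols_awk=$(awk '/Maths/ { print $3, $4 }' "$workdir/sample.txt")

echo "$cols_awk"
rm -rf "$workdir"
```

The awk version does the selection and the column extraction in one command.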



Self-Check Questions

5. awk processes how many line(s) of input at a time?
(a) 1, (b) 2, (c) depends on the available memory, (d) all lines in input.
6. awk breaks inputs into columns or words (true/false)
7. awk uses spaces to break inputs (true/false)
8. The print command can be used to print the fields of input with added text
(true/false)




3.4 Summary

The awk utility is a powerful command line tool, which can handle
streams of data: it can take input lines from a pipe. This makes it fit for
noninteractive use.


3.5 Answers to self check questions

1. true.
2. true.
3. true.
4. (a)
5. (a)
6. true.
7. true.
8. true.


3.6 Terminal Questions

1. Take the toppers.txt of this chapter. For each year and subject, print
the first name of the topper, marks and then year.
2. Do the same question as listed above but now print the complete name
of the topper followed by marks and then year.
3. See the AWK syntax. We have used only one pattern and its program
in most of our examples. Try using multiple patterns and their corresponding
programs and see the outputs.


IT 102 Unit 3, Lesson 4

LESSON 4. AWK PROGRAMMING

4. AWK PROGRAMMING
4.0 OBJECTIVES
4.1 INTRODUCTION
4.2 RELATIONAL AND LOGIC OPERATORS IN AWK
4.3 CONTROL STRUCTURES IN AWK
4.3.1 The if-else construct
4.3.2 The for loop
4.4 SPECIAL VARIABLES - NF AND NR
4.4.1 Using BEGIN and END clauses in awk
4.4.2 Using variables in AWK
4.5 RUNNING AWK PROGRAMS KEPT IN FILES
4.6 GENERATING REPORTS USING AWK
4.6.1 The printf command of AWK
4.6.2 Format specifications in printf
4.6.3 Printing the fields in different order than input
4.6.4 Creating simple reports
4.6.5 Field separator
4.6.6 Printing heading/heading row and summary/footer
4.7 MISCELLANEOUS FEATURES OF AWK
4.7.1 Specifying search patterns in AWK
4.7.2 Limiting the lines on which AWK would work
4.7.3 Built-in variables
4.7.4 Passing arguments to AWK
4.7.5 Arrays and associative arrays in AWK
4.7.6 String functions in AWK
4.7.7 Few interesting, complex examples
4.8 SUMMARY
4.9 ANSWERS TO THE SELF CHECK QUESTIONS
4.10 TERMINAL QUESTIONS

4. AWK Programming




In the previous chapter we saw how AWK can be used to process the input
data and print it in various ways as needed. In this chapter we will see
the programming features of AWK that make it very powerful.




4.0 Objectives

After going through this lesson you will know
How to use AWK programming
Relational and logic operators for conditions
Control structures
Use of variables, BEGIN and END clauses
How to generate reports using AWK
Miscellaneous features of AWK


4.1 Introduction

AWK provides a simple yet powerful programming language. The
programming language features are similar to C language constructs.

Note that we will continue to refer to the toppers.txt file from chapter 3 for
examples.


4.2 Relational and logic operators in AWK

AWK supports comparing fields to create conditions. Relational operators,
which compare two values, are available in awk. For example, a condition like
$1 == 2006 can be used. We will see such usage in the examples
below.
Relational operators like the following are there:

==    Compares whether the values specified are equal
!=    Compares whether the values specified are not equal
>     Tells whether a value is greater than the other
>=    Tells whether a value is greater than or equal to the other
<     Tells whether a value is less than the other
<=    Tells whether a value is less than or equal to the other
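As a quick illustration of the operators listed above, the sketch below uses a comparison in place of a search pattern, so that only the lines for which the comparison is true are processed. The file marks_sample.txt is a made-up four-line sample in the toppers.txt layout.

```shell
# Sketch only: marks_sample.txt is a hypothetical sample file.
workdir=$(mktemp -d)
cat > "$workdir/marks_sample.txt" <<'EOF'
Physics 92 2003 Abhay Malhotra
Maths 99 2003 Suresh Yadav
Maths 96 2004 Lokesh Arora
Physics 98 2006 Ramakant
EOF

# >= : lines whose marks (second field) are 98 or more.
high=$(awk '$2 >= 98' "$workdir/marks_sample.txt")

# != : subject and year for every year other than 2003.
not_2003=$(awk '$3 != 2003 { print $1, $3 }' "$workdir/marks_sample.txt")

# == : first names of the Physics toppers (string comparison).
physics=$(awk '$1 == "Physics" { print $4 }' "$workdir/marks_sample.txt")

echo "$high"
rm -rf "$workdir"
```

When a comparison is given without an action, as in the first command, awk prints the whole matching line.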


Multiple relational conditions can be combined using logic operators. For
example, $1 == 2006 && $2 != 98. This condition will be true only when the
first field is 2006 and the second is not equal to 98.

Logic operators like the following are there:

&&    implies logic and
||    implies logic or
!     implies logic negation

Note an important point here. In current implementations (nawk, gawk and
other POSIX awks), a relational expression evaluates to 1 when true and 0
when false, as in C, so an expression such as ($1 == 1) + ($2 == 0) is
valid and yields 0, 1 or 2. Some older versions of awk accept relational
operators only inside conditions and may report an error for such an
expression.

Examples in subsequent sections will show conditions that use relational and
logic operators.


4.3 Control structures in awk

AWK provides C like control structures as well to facilitate programming.
Control structures in AWK include the following:

4.3.1 The if-else construct

The if-else construct of AWK has the following syntax.

if (condition) statement [ else statement ]

Example 1: To print the first names of the chemistry toppers for the year
2006, we can use

awk '/Chemistry/ { if ($3 == 2006) print $4 }' toppers.txt

will print
Raju
Rajni

Note that there is no else in the example above. The else part of if-else is
optional.

Example 2: Print whether Maths toppers had more than 98 marks.

awk '/Maths/ { if ($2 > 98)
                 {
                   printf("In the year %s ", $3);
                   print $4, "had more than 98 marks.";
                 }
               else
                 {
                   printf("In the year %s ", $3);
                   print $4, "did not have more than 98 marks.";
                 } }' toppers.txt

This will print
In the year 2003 Suresh had more than 98 marks.
In the year 2004 Lokesh did not have more than 98 marks.
In the year 2005 Anup had more than 98 marks.
In the year 2006 Javed did not have more than 98 marks.

Note that there is an else part used in this example.

Also note that if there is more than one statement, the statements can be
grouped together with curly braces, as we have done in the example above.

4.3.2 The for loop

The for loop in AWK has the following syntax:

for (initialization; condition; increment) statement

Example 3: To print a colon after each of the first four fields we can use

awk '/Maths/ { for (i = 1; i <= 4; i++) printf("%s:", $i);
               printf("\n"); }' toppers.txt

will print
Maths:99:2003:Suresh:
Maths:96:2004:Lokesh:
Maths:99:2005:Anup:
Maths:98:2006:Javed:



Note that $0 contains the entire input line and $1 onwards contain the
fields. Also note that we have used a variable i here. We will see details on
variables in AWK later.



Self-Check Questions

1. AWK programs can compare two fields of the input line. (true/false).
2. Relational operators give true or false but return value cannot be used in
expressions (true/false).
3. The if-else construct of AWK mandates that the else part must be there
(true/false).
4. for loop can have a block of statements enclosed in curly brackets (true/false).




4.4 Special variables - NF and NR

Awk provides internal special variables called NF and NR.

NF stands for the number of fields in the currently read line.
NR stands for the total number of records (lines) read so far.

Example 4: Printing only the long lines, those with more than 5 fields:

awk '{if (NF > 5) print}' toppers.txt

this will print
Maths 98 2006 Javed M. K. Akhtar

Example 5: For Maths toppers, if we want to skip printing the year, we can
use the following AWK command:

awk '/Maths/ { for (i = 1; i <= NF; i++) if (i != 3) printf("%s ", $i);
               printf("\n");
             }' toppers.txt

will print
Maths 99 Suresh Yadav
Maths 96 Lokesh Arora
Maths 99 Anup Mathur
Maths 98 Javed M. K. Akhtar


4.4.1 Using BEGIN and END clauses in awk

Usual programming tasks consist of
Initializing some variables
Reading inputs, performing some calculations and outputs
Finally, generating some output based on the complete input set.
The BEGIN clause of AWK lets you specify initializations. And, the END
clause lets you perform calculations based on the entire input.

Example 6: Suppose you want to print the total number of toppers.

awk 'END {print "There are", NR, "toppers."}' toppers.txt

will print
There are 13 toppers.


4.4.2 Using variables in AWK

AWK provides $0, $1, $2, etc. as fields. In addition, you can use your own
variables as well for any calculations. You need not declare a variable.
Simply using a variable is permitted.

Example 7: Suppose we want to find out the average top marks for physics
over the years. NR cannot be used as the divisor here, because it counts
all the input lines and not just the physics lines, so we keep our own count.

awk '/Physics/ {marks += $2; count++}
     END {print "The average top marks in physics are " marks/count}' toppers.txt

This will print
The average top marks in physics are 93.375

In this example, "marks" and "count" are user defined variables. You can use
almost any string of characters as a variable name in AWK, as long as the name
doesn't conflict with some string that has a specific meaning to Awk, such as
"print" or "NR" or "END".

There is no need to declare the variable, or to initialize it. A variable handled
as a string variable is initialized to the "null string", meaning that if you try to
print it, nothing will be there. A variable handled as a numeric variable will be
initialized to zero.



Self-Check Questions

5. Special AWK variable NF stands for
(a) Next field, (b) New Format, (c) Number of fields, (d) Next Line
6. END is used in AWK to
(a) Exit from AWK, (b) To do final calculations
7. You can use any variable in AWK but you need to declare it first
(a) true, (b) false.






4.5 Running AWK programs kept in files

As you must have noticed, AWK programs can easily be longer than one line.
Typing long programs on the command line is quite cumbersome. Moreover,
whenever you create programs, you would want to keep them in files to be
able to use them over and over again.

AWK provides a way to run AWK programs kept in files. The commands can be
written into a file, and then AWK can be told to execute the commands from
that file as follows:

awk -f <awk program file name> <input file name>

Example 8: Suppose someone has a coin collection with gold and silver coins.
Details of this collection are listed below.

bash>cat coin_collection_details.txt
Coin type weight(gm) year of making
Gold 1 1945
Gold 1 1952
Silver 10 1948
Gold 1 1973
Gold 1 1973
Gold 0.5 1945
Gold 0.1 1933
Silver 1 1943
Gold 0.25 1921

Now we can create an AWK program to print a summary of this coin collection
as shown below:

bash>cat show_coin_summary
/Gold/   { num_gold++; wt_gold += $2 }        # Get weight of gold.
/Silver/ { num_silver++; wt_silver += $2 }    # Get weight of silver.
END { val_gold = 485 * wt_gold;               # Compute value of gold.
      val_silver = 16 * wt_silver;            # Compute value of silver.
      total = val_gold + val_silver;
      print "Summary data for coin collection:";    # Print results.
      printf ("\n");
      printf (" Gold pieces: %2d\n", num_gold);
      printf (" Weight of gold pieces: %5.2f\n", wt_gold);
      printf (" Value of gold pieces: %7.2f\n", val_gold);
      printf ("\n");
      printf (" Silver pieces: %2d\n", num_silver);
      printf (" Weight of silver pieces: %5.2f\n", wt_silver);
      printf (" Value of silver pieces: %7.2f\n", val_silver);
      printf ("\n");
      printf (" Total number of pieces: %2d\n", num_gold + num_silver);
      printf (" Value of collection: %7.2f\n", total); }

Note that AWK programs allow you to put comments as well. See the first two
lines of the show_coin_summary file listed above. Also note that the search
patterns are written as Gold and Silver, exactly as in the data, because
matching is case-sensitive, and that the pieces are counted with our own
variables rather than NR, because NR would also count the heading line.

You can run this AWK program as shown below:

bash>awk -f show_coin_summary coin_collection_details.txt

The output of this run will be:
Summary data for coin collection:

 Gold pieces:  7
 Weight of gold pieces:  4.85
 Value of gold pieces: 2352.25

 Silver pieces:  2
 Weight of silver pieces: 11.00
 Value of silver pieces:  176.00

 Total number of pieces:  9
 Value of collection: 2528.25
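The same workflow can be tried end to end with a much smaller program. In this sketch the names marks.txt and count_physics.awk are made up for the demonstration; only the -f option itself comes from the text above.

```shell
# Sketch only: marks.txt and count_physics.awk are hypothetical files.
workdir=$(mktemp -d)
cat > "$workdir/marks.txt" <<'EOF'
Physics 92 2003
Maths 99 2003
Physics 98 2006
EOF

# The awk program is kept in its own file, comments included.
cat > "$workdir/count_physics.awk" <<'EOF'
# Count the Physics lines and add up their marks.
/Physics/ { count++; total += $2 }
END       { printf("%d physics entries, total marks %d\n", count, total) }
EOF

# -f names the program file; the input file follows as usual.
report=$(awk -f "$workdir/count_physics.awk" "$workdir/marks.txt")

echo "$report"
rm -rf "$workdir"
```

Inside a program file no surrounding quotes are needed, because the shell never sees the program text.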

4.6 Generating reports using AWK

AWK programs can be used to quickly process text inputs and create various
reports. Because AWK processes each record as fields, AWK is much more
helpful in creating reports, compared to other Unix utilities, like sed.

4.6.1 The printf command of AWK

While the print command is available in AWK, print is quite a basic command.
Often more sophisticated formatting is needed, especially while generating
reports. For sophisticated output formatting, a C like printf command is
available in AWK.

Printf uses format specifications like %s, %d, etc. for formatting output.
%s prints a string
%d prints a number in decimal format
%f prints a floating point number

In addition, you can use the following as well to control spacing
\t to print a tab
\n to print a new line

Note that tabs come in very handy, especially to print well aligned columns.
The input text fields may vary in length. If you separate out fields with
spaces, the fields in output may not align well. Use tabs to get better
aligned outputs.
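A few printf calls can be tried in isolation inside a BEGIN clause, which needs no input file. The values below are chosen only for illustration, and the square brackets are added to make the padding visible.

```shell
# Sketch: printf format specifications tried without any input file.
formatted=$(awk 'BEGIN {
    printf("%s scored %d marks\n", "Abhay", 92)   # string and integer
    printf("average: %.2f\n", 94.5)               # float with two decimals
    printf("[%10s]\n", "Hello")                   # right-aligned in 10 columns
    printf("[%-10s]\n", "Hello")                  # left-aligned in 10 columns
    printf("%s\t%s\n", "Suresh", "2003")          # tab between two columns
}')

echo "$formatted"
```

Unlike print, printf adds no newline of its own, so each format string ends with \n.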

Example 1: Printing the topper name and year for Maths, with spaces.

awk '/Maths/ {printf("%s %s\n", $4, $3); }' toppers.txt

will print
Suresh 2003
Lokesh 2004
Anup 2005
Javed 2006

You can see that the output columns are not aligned.

Example 2: Printing the topper name and year for Maths, with tabs.

awk '/Maths/ {printf("%s\t%s\n", $4, $3); }' toppers.txt

will print
Suresh  2003
Lokesh  2004
Anup    2005
Javed   2006

You can see that the output columns are well aligned now after using tab.


4.6.2 Format specifications in printf

The printf command of AWK accepts many format specifiers. Moreover, for
each format specifier, you can control how the output will be printed.
This control helps further in making the reports more readable.

The table below lists how values will be printed when certain format specifiers
are used:

Format     Value             Results
%s         Hello             "Hello"
%10s       Hello             "     Hello"
%-10s      Hello             "Hello     "
--------------------------------------------
%c         100               "d"
%10c       100               "         d"
%010c      100               "000000000d"
--------------------------------------------
%d         10                "10"
%10d       10                "        10"
%10.4d     10.123456789      "      0010"
%10.8d     10.123456789      "  00000010"
%.8d       10.123456789      "00000010"
%010d      10.123456789      "0000000010"
--------------------------------------------
%e         987.1234567890    "9.871235e+02"
%10.4e     987.1234567890    "9.8712e+02"
%10.8e     987.1234567890    "9.87123457e+02"
%f         987.1234567890    "987.123457"
%10.4f     987.1234567890    "  987.1235"
%010.4f    987.1234567890    "00987.1235"
%10.8f     987.1234567890    "987.12345679"
--------------------------------------------
%g         987.1234567890    "987.123"
%10g       987.1234567890    "   987.123"
%10.4g     987.1234567890    "     987.1"
%010.4g    987.1234567890    "00000987.1"
%.8g       987.1234567890    "987.12346"


Self-Check Questions

8. If you use tabs in printf, the output will not be aligned (true/false)
9. Tab is printed by putting (a) \T, (b) \tab, (c) \t, (d) tab
10. For printing a string using print, a format specification is needed (true/false)
11. If you use a printf with %10s and give worlds as argument to the printf, the
output will come as 10worlds (true/false).



4.6.3 Printing the fields in different order than input

If you want to print some of the fields in an order that is different from the
input, you can simply change the order of the $ variables in the print commands.
This powerful feature is often useful when creating reports as well.

Example 3: To print the year followed by the marks for 2006 (in the input,
the marks come before the year):

awk '{if ($3 == 2006) print $3, $2}' toppers.txt

will print the following
2006 98
2006 88
2006 89
2006 98

AWK features make it very useful to process data and print reports, especially
when the data is arranged in columns like our toppers.txt example. Let's see a
few examples before looking at more AWK features.

4.6.4 Creating simple reports

Creation of simple reports is straightforward using AWK.

Example 4: If you want to print the physics toppers for years prior to 2005, you
can use the following command (note that the year is the 3rd field in the input
text):

awk '/Physics/ {if ($3 < 2005) printf("%s %s %s\n", $3, $4, $5)}' toppers.txt > phy_toppers_before_2005.txt

bash>cat phy_toppers_before_2005.txt
2003 Abhay Malhotra
2004 Shriesh Jadhav

Example 5: If you want to print a simple yes/no answer whether the topper
had more than 92 marks or not, you can use the following:

awk '{if ($2 > 92)
        printf("%s\t%s\tyes\n", $3, $1)
      else
        printf("%s\t%s\tno\n", $3, $1); }' toppers.txt > more_than_92.txt

bash>cat more_than_92.txt
2003 Physics no
2003 Chemistry yes
2003 Maths yes
2004 Physics yes
2004 Chemistry yes
2004 Maths yes
2005 Physics no
2005 Chemistry no
2005 Maths yes
and so on.

You can see how quickly awk can be used to generate reports like this.

Example 7: For Maths toppers, if we want to put a colon between fields except
in the names, we can use the following AWK command:

awk '/Maths/ { for (i = 1; i <= NF; i++)
                 {
                   if (i < 4) printf("%s:", $i);
                   else printf("%s ", $i);
                 }
               printf("\n");
             }' toppers.txt

will print
Maths:99:2003:Suresh Yadav
Maths:96:2004:Lokesh Arora
Maths:99:2005:Anup Mathur
Maths:98:2006:Javed M. K. Akhtar

Note that the special variable NF has been used to define the terminating
condition. With the use of NF you can work with data having a variable number
of columns as well; here we are able to print names that fit in 2 fields (e.g.,
Lokesh Arora) and names that need 4 fields (e.g., Javed M. K. Akhtar).

Also note that we have used if-else inside a for loop. The if-else part
ensures that there are no colons in the names.

4.6.5 Field separator

AWK works by reading one input record (one line) and breaking it up into
fields. By default, AWK uses white space (spaces and tabs) as the field
separator. However, you may encounter tabular data that uses some other
character as separator. For example, your input data may look like the output
of example 7.

Maths:99:2003:Suresh Yadav
Maths:96:2004:Lokesh Arora
Maths:99:2005:Anup Mathur
Maths:98:2006:Javed M. K. Akhtar

Here colon (:) is the separator.

In such cases, you can tell AWK what character to use as field separator. The
field separator is an optional argument to the awk command.
awk -F<ch>

e.g., awk -F:    tells AWK to use colon as a separator
      awk -F'|'  tells AWK to use bar as a separator
      awk -F'"'  tells AWK to use double quote as a separator

Example 8: If the input line is Maths:99:2005:Anup Mathur
and AWK is run with -F: as an argument, then
$1 will contain Maths
$2 will contain 99
$3 will contain 2005
$4 will contain Anup Mathur

Note that $4 here will contain the entire name itself because the separator has
been set as colon.

Example 9: You can pipe the output of one awk into another awk as well. So
we can pipe the output of example 7 above into another AWK.

awk '/Maths/ { for (i = 1; i <= NF; i++)
                 {
                   if (i < 4) printf("%s:", $i);
                   else printf("%s ", $i);
                 }
               printf("\n");
             }' toppers.txt | awk -F: '{printf("%-18s\t%d\n", $4, $3); }'

will print
Suresh Yadav        2003
Lokesh Arora        2004
Anup Mathur         2005
Javed M. K. Akhtar  2006

4.6.6 Printing heading/heading row and summary/footer

The BEGIN and END clauses can be used even to print headings and
summary for reports, thus making the report more readable and attractive.

Example 10: Here we will print the physics toppers with headers and will print
a summary at the end.

awk 'BEGIN {
       printf("Physics toppers details:\n");
       printf("---------------------------------------------\n");
       printf("Year\tMarks\tName of the topper\n");
       printf("---------------------------------------------\n");
     }
     /Physics/ {
       printf("%d\t%s\t%s\n", $3, $2, $4);
       sum += $2; count++;
     }
     END {
       printf("---------------------------------------------\n");
       printf("Avg top marks in physics were %.3f\n", sum/count);
       printf("---------------------------------------------\n");
     }' toppers.txt

This will print

Physics toppers details:
---------------------------------------------
Year    Marks   Name of the topper
---------------------------------------------
2003    92      Abhay
2004    94.5    Shriesh
2005    89      Vandana
2006    98      Ramakant
---------------------------------------------
Avg top marks in physics were 93.375
---------------------------------------------






Self-Check Questions

12. AWK always prints the fields in the same order as they appear in the input
(true/false).
13. AWK can generate reports containing only the input fields. No other items can
be added. (true/false).
14. Field separator in AWK is fixed and cannot be changed (true/false).




4.7 Miscellaneous features of AWK

4.7.1 Specifying search patterns in AWK

As we have seen in several examples and in the AWK syntax, search patterns,
along with their respective programs, can be used in AWK. So far we have
used simple search patterns like the example below:

awk '/Physics/ {print}' toppers.txt

However, AWK supports much more sophisticated patterns also, as listed
below.

/The/ matches any lines containing The
So this will match lines containing There, These, Them too.
But this will not match lines containing the, these, them, etc. because AWK
uses case sensitive matching.

/^The/ matches any lines beginning with The.
So this will match lines which contain The, These, Them at the beginning only.

/The$/ matches any lines ending with The

/The\$$/ matches any lines ending with The$

/[Tt][Hh][Ee]/ matches any lines with THE, The, tHe, thE, etc.

/^[a-zA-Z][a-zA-Z0-9_]*$/ matches lines containing only a single identifier.

/(^India)|(^Pakistan)/ matches lines beginning with India or Pakistan

You can even use complex regular expressions in AWK. The regular
expressions can be created by using the following characters:

?    matches zero or one occurrence of the character before it
+    matches one or more occurrences of the character before it
*    matches zero or more occurrences of the character before it
.    The dot matches any character

For example, the following expression will match any line containing only a
signed integer. The matched line cannot contain any other characters.
/^[+-]?[0-9]+$/ matches signed integers.

Example 1: A data file contains some text and some integer numbers. Here is
the data file:

bash>cat data_file.txt
The number of loans given
12399
The number of loans fully repaid by now
2893
The number of defaulters
129
Defaulted amount (loss)
-8929972
Loss after adjusting procedural expenses
-9288990.72

awk '/^[+-]?[0-9]+$/ {print}' data_file.txt
will print
12399
2893
129
-8929972

4.7.2 Limiting the lines on which AWK would work

By default, awk works on each of the lines of input. We have already seen
that we can use search patterns to limit the lines on which AWK would work.
In addition, you can limit AWK to work only on some block of input lines.

/^India/,/^Pakistan/ will operate on a block of lines, from a line starting
with India up to and including the next line starting with Pakistan.

NR == 15 will operate only on the 15th line.
NR==10,NR==25 will operate on lines 10 to 25.
$1 == "India" will operate on lines where the first field is "India"
$1 ~ /India/ will operate on lines where the first field contains India.


You can even create complex conditions using &&, || operators
e.g.,
((NR >= 30) && ($1 == "India")) || ($1 == "Pakistan")
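The line-limiting patterns above can be tried on a small file. This is a sketch under assumptions: countries.txt and its five lines are made up for the demonstration, and the line numbers in the conditions are scaled down to fit such a short file.

```shell
# Sketch only: countries.txt is a hypothetical five-line file.
workdir=$(mktemp -d)
cat > "$workdir/countries.txt" <<'EOF'
Nepal 1
India 2
Bhutan 3
Pakistan 4
Sri Lanka 5
EOF

# A single line selected by its record number.
by_line=$(awk 'NR == 2' "$workdir/countries.txt")

# A block of lines selected by record numbers.
by_numbers=$(awk 'NR==2,NR==3' "$workdir/countries.txt")

# A block that starts and ends at lines matching two patterns.
by_patterns=$(awk '/^India/,/^Pakistan/' "$workdir/countries.txt")

# A combined condition built with the && and || operators.
combined=$(awk '(NR >= 3 && $1 == "Pakistan") || $1 == "Nepal" { print $1 }' "$workdir/countries.txt")

echo "$by_patterns"
rm -rf "$workdir"
```

The pattern-based block includes both its first line (India) and its last line (Pakistan).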

Example 2: If you know that your input data has some header text and some
footer text and the data of your interest lies in between, then you should use
such patterns to limit AWK to work only on the data and not on the header and
footer.

bash>cat data.txt
-------------------------------------------------
The weather report for 24.05.2007
-------------------------------------------------
City Humidity Max Temp
Agra 92 38
Delhi 93 39
Mumbai 98 34
Copyright CNN world
Data from 2pm IST

Lines 1 to 4 are the header (including the column titles) and lines 8 and 9
are the footer, so the pattern selects lines 5 to 7 only.

awk 'NR > 4 && NR < 8 {printf("%s\tTemp=%d\n", $1, $3);}' data.txt
will print
Agra	Temp=38
Delhi	Temp=39
Mumbai	Temp=34

Self-Check Questions

15. AWK search patterns are case-insensitive. (true/false)
16. /NASA/ will match only lines containing NASA. (true/false)
17. AWK will work on each line of input. There is no way to limit the scope.
(true/false)

4.7.3 Built-in variables

We have used many of the built-in variables of AWK, such as $0, $1, $2, etc.
and NF, NR. In addition, AWK has a few other built-in variables, as listed
below.

Note that these variables are not read-only. That means that while an AWK
program runs, the program itself can change the value of a variable!

FS : Field separator. By default AWK uses whitespace (spaces and tabs) as the
field separator, and we have seen the -F option that can be used on the
command line to specify the field separator to be used by AWK. In addition,
AWK has a built-in variable FS that specifies the field separator.
RS : Record separator. By default AWK reads each line as one input record,
which means the default record separator is the newline. However, you can use
RS to change the record separator.
OFS: Stores the "output field separator", which separates the fields when AWK
prints them. The default is a single space character.
ORS: Stores the "output record separator", which separates the output lines
when AWK prints them. The default is a newline character.
FILENAME: Contains the name of the current input file.
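
The effect of these variables is easiest to see by changing them. The sketch
below is a minimal illustration; the file weather.txt and its colon-separated
layout are assumptions made for this example only.

```shell
# Hypothetical colon-separated input file.
tmpdir=$(mktemp -d)
printf 'Agra:92:38\nDelhi:93:39\n' > "$tmpdir/weather.txt"

# FS and OFS are set in BEGIN. Assigning $1 to itself makes AWK rebuild $0
# with OFS between the fields; print with commas also uses OFS.
fs_out=$(awk 'BEGIN { FS = ":"; OFS = " | " } { $1 = $1; print NR, NF, $0 }' \
    "$tmpdir/weather.txt")
echo "$fs_out"

# RS changed to ";" so that each city becomes a separate record.
rs_out=$(printf 'Agra;Delhi;Mumbai' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')
echo "$rs_out"

# FILENAME holds the name of the file currently being read.
fn_out=$(awk 'NR == 1 { print FILENAME }' "$tmpdir/weather.txt")
echo "$fn_out"

rm -rf "$tmpdir"
```

Setting FS inside BEGIN has the same effect as giving -F: on the command line,
because BEGIN runs before the first record is read.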

4.7.4 Passing arguments to AWK

So far we have seen AWK programs and commands where the values were
fixed. For example, consider the example from chapter 4 where a fixed value is
being used:

Example 3: Print whether Maths toppers had more than 98 marks.

awk '/Maths/ { if( $2 > 98 )
    {
        printf("In the year %s %s had more than 98 marks.\n", $3, $4);
    }
    else
    {
        printf("In the year %s %s had less than 98 marks.\n", $3, $4);
    } }' toppers.txt

This will print

In the year 2003 Suresh had more than 98 marks.
In the year 2004 Lokesh had less than 98 marks.
In the year 2005 Anup had more than 98 marks.
In the year 2006 Javed had more than 98 marks.

Now, you may be asked to print the same report but for 94 marks. In that
case, you would need to copy the script and replace 98 by 94.
Such copying must be avoided because (a) it creates multiple scripts doing
nearly the same thing, (b) if you fix an error in one file you will need to
fix it in all the files of the same type, and (c) the operation of copying and
modifying is very error prone (what if the change from 98 to 94 is done in all
places but gets accidentally left out at one place?). Therefore, it is safer
to write your scripts in a generic way. Consider Example 3 again, generalized
as Example 4 below:

Example 4: Print whether Maths toppers had more than N marks.

The program is stored in a file named report_script and is invoked as

awk -f report_script N=94 toppers.txt

Note that we are passing N=94 on the command line. So if another report is
needed with N=55, we need not copy or modify the file; we can simply
pass N=55 on the command line itself.
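
The same idea can be tried end to end as follows. This is a sketch under
stated assumptions: the file names marks.txt and report.awk and the column
layout (subject, marks, year, name) are invented for the illustration, and the
-v option, which is not covered in the text above, is the standard AWK option
for assigning a variable before the program starts.

```shell
# Hypothetical data: subject, marks, year, name.
tmpdir=$(mktemp -d)
cat > "$tmpdir/marks.txt" <<'EOF'
Maths 99 2003 Suresh
Maths 97 2004 Lokesh
Physics 95 2004 Anup
EOF

# A generic report program; N is supplied from the command line.
cat > "$tmpdir/report.awk" <<'EOF'
/Maths/ {
    if ($2 > N)
        printf("%s %s: more than %d\n", $3, $4, N)
    else
        printf("%s %s: not more than %d\n", $3, $4, N)
}
EOF

# Form 1: assignment placed among the file operands (as in Example 4).
out98=$(awk -f "$tmpdir/report.awk" N=98 "$tmpdir/marks.txt")
echo "$out98"

# Form 2: assignment with -v, done before the program starts.
out94=$(awk -v N=94 -f "$tmpdir/report.awk" "$tmpdir/marks.txt")
echo "$out94"

rm -rf "$tmpdir"
```

The two forms differ in timing: a value given with -v is already set inside a
BEGIN block, whereas an assignment written among the file names takes effect
only when AWK reaches it, just before the next file is read.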

For reference, the file report_script used in Example 4 contains:

bash>cat report_script
/Maths/ {
    if( $2 > N )
    {
        printf("In the year %s %s had more than %d marks.\n", $3, $4, N);
    }
    else
    {
        printf("In the year %s %s had less than %d marks.\n", $3, $4, N);
    }
}

4.7.5 Arrays and associative arrays in AWK

Any user-defined variable can work as an array in AWK. You can simply
assign values with indexing. For example,

Field[1] = $1
Field[3] = $3

AWK also supports associative arrays, in which the index is a string rather
than a number.

For example, if $i contains the name of a city and $j contains the city's
temperature, you can store this information in an associative array.

Temperature[ $i ] = $j;

4.7.6 String functions in AWK

If you place multiple strings side by side, they will be joined.

a = "DTU" "Delhi" # a will become "DTUDelhi".

length() function returns the length of a given string.

substr(str, startIndex, length) function takes out the substring.
substr("DTUDelhi", 4, 5) will return "Delhi".
Note that the index starts from 1, not 0.

index(str, searchStr) gives the index at which searchStr first occurs in str,
or 0 if it does not occur.
index("DTUDelhi", "Delhi") will return 4.
index("DTUDelhi", "DEI") will return 0.

split(str, array [,separator]) splits a string by the separator and fills the
pieces into an array. If no separator is given, the field separator FS is used.
split("mera bharat mahan", slogan) will set
slogan[1] = "mera"
slogan[2] = "bharat", etc.
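
The string functions above, together with the associative arrays of section
4.7.5, can be tried without any input file by putting the statements in a
BEGIN block. The sketch below uses the standard AWK names length, substr,
index and split; the city and temperature values in the second command are
sample data chosen for the illustration.

```shell
# String functions, evaluated in BEGIN so that no input is needed.
str_out=$(awk 'BEGIN {
    a = "DTU" "Delhi"                        # joined by placing side by side
    print a
    print length(a)
    print substr(a, 4, 5)                    # 5 characters starting at index 4
    print index(a, "Delhi")                  # position of the first match
    print index(a, "DEI")                    # 0 because there is no match
    n = split("mera bharat mahan", slogan)   # n is the number of pieces
    print n, slogan[1], slogan[3]
}')
echo "$str_out"

# Associative array indexed by city name (sample data).
arr_out=$(printf 'Agra 38\nDelhi 39\nMumbai 34\n' |
    awk '{ Temperature[$1] = $2 } END { print Temperature["Delhi"] }')
echo "$arr_out"
```

split() returns the number of pieces it stored, which is convenient as the
upper limit of a loop over the array.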



Self-Check Questions

18. AWK provides a built-in variable for the field separator. (true/false)
19. Built-in variables are read-only. (true/false)
20. Variables passed to AWK are accessed as $1, $2, etc. (true/false)
21. AWK does not support complex structures but supports associative arrays.
(true/false)



4.7.7 A few interesting, complex examples

A few interesting examples are listed below. These exemplify the power of
AWK.

Example 5: Counting non-blank lines in a file:
awk 'NF != 0 {++count} END {print count}' input_file.txt

Example 6: Computing the average size of files in a directory. The first line
of the ls -l output (the "total" line) is skipped, so the sum is divided by
NR-1:

ls -l | awk 'NR!=1 {s+=$5} END {print "Average: " s/(NR-1)}'

Example 7: Print Fibonacci numbers:

awk 'BEGIN {a=1;b=1; while(++x<=10){print a; t=a;a=a+b;b=t}; exit}'

Example 8: Sometimes we may repeat words unintentionally, as in "When I
was was going there". Detecting these manually is difficult, but we can write
an AWK program to do this!

# w holds the previous word; "xy-zzy" is a starting value unlikely to occur.
BEGIN { dups=0; w="xy-zzy" }
{ for( n=1; n<=NF; n++)
    { if ( w == $n ) { print w, "::", $0 ; dups = 1 }   # same as previous word
      w = $n }
}
END { if (dups == 0) print "No duplicates found."}
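
Example 8 lists only the AWK program. One way to run it, sketched below, is to
save it in a file and use the -f option. The file name dupwords.awk and the
two sample sentences are assumptions made for this illustration.

```shell
# Save the program of Example 8 in a (hypothetically named) file.
tmpdir=$(mktemp -d)
cat > "$tmpdir/dupwords.awk" <<'EOF'
BEGIN { dups = 0; w = "xy-zzy" }
{
    for (n = 1; n <= NF; n++) {
        if (w == $n) { print w, "::", $0; dups = 1 }
        w = $n
    }
}
END { if (dups == 0) print "No duplicates found." }
EOF

# Input with a repeated word.
dup_out=$(printf 'When I was was going there\n' | awk -f "$tmpdir/dupwords.awk")
echo "$dup_out"

# Input without any repeated word.
clean_out=$(printf 'nothing repeated here\n' | awk -f "$tmpdir/dupwords.awk")
echo "$clean_out"

rm -rf "$tmpdir"
```

Because w is not reset at the start of each line, the program also reports a
word that ends one line and begins the next.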



4.8 Summary

AWK is a very powerful utility in Unix. It helps in scripting and report
generation.


4.9 Answers to the self check questions

1 true
2 true
3 false
4 true
5 (c)
6 (b)
7 (b)
8 false
9 (c)
10 false
11 false
12 false
13 false
14 false
15 false
16 false
17 false
18 true
19 false
20 false
21 true


4.10 Terminal Questions

1. Take the toppers.txt of this chapter. For each year and subject, print the
first name of the topper, the marks and then the year.
2. Do the same as in question 1, but now print the complete name of the
topper followed by the marks and then the year.
3. Print the Chemistry toppers' marks, years and names for even years.
4. Print the years in which the toppers scored >= 97 marks.
5. The input contains name and phone number records. To simplify, assume there
is only one name (first name) and only one phone number per name. Use
associative arrays to store the numbers and names, and print them at the end.
6. Upgrade Example 8 to also print the line number where the repeated word
occurs.
7. See the AWK syntax. We have used only one pattern and its program in our
examples. Try using multiple patterns and their corresponding programs and
see the outputs.
8. Generalize the coins example of chapter 4 by passing the per-gram values
of gold and silver in place of the hard-coded values used in that example.
