Indira Gandhi National Open University

CS-02 Introduction to Software

UNIX Operating System-I

UNIT 1 Theoretical Concepts of UNIX Operating System
UNIT 2 UNIX - Getting Started I
UNIT 3 UNIX - Getting Started II
UNIT 4 Text Manipulation
UNIT 5 Editors

BLOCK INTRODUCTION

In the previous block we discussed several theoretical concepts of operating systems in general, which included file management, process management and memory management. In this block (except the first unit, which deals with the theoretical issues related to UNIX), we present an in-depth study of the UNIX operating system, its files and directories, the line editors ex and ed, the screen editor vi and the text manipulation commands.

The block contains five units:

UNIT 1 deals with how file management, process management and memory management are done in the UNIX operating system.

UNIT 2 describes how to login, how to close the UNIX session and some basic concepts like the hierarchical directory structure.

UNIT 3 discusses some simple UNIX commands and how to change the permission modes. It also introduces the concepts of filters and pipelines.

UNIT 4 deals with text manipulation commands, which include the grep, cmp, comm and diff commands. It also describes how to arrange text files with cut, paste and sort and how to split and translate using tr.

UNIT 5 discusses various features of the screen editor vi and the line editors ed and ex.

Suggested Readings:

1. Andrew S. Tanenbaum, Operating System Design & Implementation, Prentice Hall of India, 1990.
2. Maurice J. Bach, The Design of the UNIX Operating System, Prentice Hall of India, 1989.
3. Rachel Morgan & Henry McGilton, Introducing UNIX System V, McGraw-Hill International Editions, 1987.

UNIT 1 THEORETICAL CONCEPTS OF UNIX OPERATING SYSTEM

Structure

1.0 Introduction
1.1 Objectives
1.2 Basic Features of UNIX Operating System
1.3 File Structure
1.4 CPU Scheduling
1.5 Memory Management
    1.5.1 Swapping
    1.5.2 Demand Paging
1.6 File System
    1.6.1 Blocks and Fragments
    1.6.2 Inodes
    1.6.3 Directory Structure
1.7 Summary
1.8 Model Answers

1.0 INTRODUCTION

In block 2 we discussed several theoretical concepts of operating systems in general; it is often useful to see them in practice. In this block we present an in-depth study of UNIX: shell programming, editors, system administration, etc.

1.1 OBJECTIVES

After going through this unit you will be able to:

* List the basic features of the UNIX operating system
* Describe the UNIX file structure
* Discuss CPU scheduling in the UNIX system
* Discuss memory management schemes in UNIX
* Discuss file systems in the UNIX operating system

1.2 BASIC FEATURES OF UNIX OPERATING SYSTEM

* It is written in a high-level language, C, making it easy to port to different configurations.
* It is a good operating system, especially for programmers. The UNIX programming environment is unusually rich and productive. It provides features that allow complex programs to be built from simple programs.
* It uses a hierarchical file system that allows easy maintenance and efficient implementation.
* It uses a consistent format for files, the byte stream, making application programs easier to write.
* It is a multi-user, multiprocess system. Each user can execute several processes simultaneously.
* It hides the machine architecture from the user, making it easier to write programs that run on different hardware implementations.

UNIX System Architecture

As do most computer systems, UNIX consists of two separable parts: the Kernel and the System Programs. We can view the UNIX operating system as being layered as shown in figure 1.

[Figure 1: UNIX System Architecture. Layers from top to bottom: shells, editors and commands (who, wc, grep, cmp), compilers and interpreters; the Kernel, comprising signals, terminal handling and the character I/O system, the file system, swapping and the block I/O system, CPU scheduling, page replacement, demand paging and virtual memory; and the hardware: device controllers, terminals, disks and tapes, and physical memory.]

Everything below the system call interface and above the physical hardware is the Kernel. The Kernel provides the file system, CPU scheduling, memory management and other operating system functions through system calls. Programs such as the shell (sh) and editors (vi) shown in the top layer interact with the Kernel by invoking a well defined set of system calls. The system calls instruct the Kernel to do various operations for the calling program and exchange data between the Kernel and the program.

System calls for UNIX can be roughly grouped into three categories: file manipulation, process control and information manipulation. Another category can be considered for device manipulation, but, since devices in UNIX are treated as (special) files, the same system calls support both files and devices.

1.3 FILE STRUCTURE

A file in UNIX is a sequence of bytes. Different programs expect various levels of structure, but the Kernel does not impose any structure on files, and no meaning is attached to their contents - the meaning of the bytes depends solely on the programs that interpret the file. This is true not just of disk files but of peripheral devices as well. Magnetic tapes, mail messages, characters typed on the keyboard, line printer output, data flowing in pipes - each of these is just a sequence of bytes as far as the system and the programs in it are concerned.

Files are organized in tree-structured directories. Directories are themselves files that contain information on how to find other files. A path name to a file is a text string that identifies a file by specifying a path through the directory structure to the file. Syntactically it consists of individual file name elements separated by the slash character.
For example, in /usr/Akshay/data, the first slash indicates the root of the directory tree, called the root directory.

1.5 MEMORY MANAGEMENT

CPU scheduling is strongly influenced by memory management schemes. At least part of a process must be contained in primary memory to run; a process cannot be executed by the CPU if it exists entirely on secondary storage. It is also not possible to keep all active processes in main memory. For example, a 4 MB main memory cannot provide space for a 5 MB process. It is the job of the memory management module to decide which processes should reside (at least partially) in main memory, and to manage the parts of the virtual address space of a process that reside on secondary storage devices. It monitors the amount of available physical memory and provides swapping of processes between physical memory and secondary storage devices.

1.5.1 Swapping

Early UNIX systems transferred entire processes between primary memory and the secondary storage device but did not transfer parts of a process independently, except for shared text. Such a memory management policy is called swapping. UNIX was first implemented on the PDP-11, where the total physical memory was limited to 256 Kbytes. The total memory resources were insufficient to justify or support complex memory management algorithms. Thus, UNIX swapped entire process memory images.

Allocation of both main memory and swap space is done first-fit. When the size of a process' memory image increases (due to either stack expansion or data expansion), a new piece of memory big enough for the whole image is allocated. The memory image is copied, the old memory is freed, and the appropriate tables are updated. (An attempt is made in some systems to find memory contiguous to the end of the current piece, to avoid some copying.)
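The first-fit policy mentioned above can be illustrated with a minimal Python sketch. It is only a toy model under stated assumptions: the free list is a plain list of (start, length) holes, and the name first_fit and the sample hole sizes are hypothetical, not taken from any UNIX source.

```python
def first_fit(free_list, size):
    """Allocate `size` units from the first hole that is large enough.

    `free_list` is a list of (start, length) holes ordered by start address
    (an assumption of this sketch). The list is updated in place. Returns the
    start address of the allocated piece, or None if no single hole fits.
    """
    for i, (start, length) in enumerate(free_list):
        if length >= size:
            if length == size:
                del free_list[i]                              # hole used up exactly
            else:
                free_list[i] = (start + size, length - size)  # shrink the hole
            return start
    return None


holes = [(0, 30), (50, 100), (200, 40)]
print(first_fit(holes, 40))    # the first hole is too small, so the second is used
print(holes)
print(first_fit(holes, 500))   # no single hole is large enough
```

In this model a result of None stands for the situation where no single piece of memory can hold the image.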
If no single piece of main memory is large enough, the process is swapped out such that it will be swapped back in with the new size.

There is no need to swap out a sharable text segment, because it is read-only, and there is no need to read in a sharable text segment for a process when another instance is already in memory. That is one of the main reasons for keeping track of sharable text segments: less swap traffic. The other reason is the reduced amount of main memory required for multiple processes using the same text segment.

Decisions regarding which processes to swap in or swap out are made by the scheduler process (also known as the swapper). The scheduler wakes up at least once every 4 seconds to check for processes to be swapped in or out. A process is more likely to be swapped out if it is idle, has been in main memory for a long time, or is large; if no obvious candidates are found, other processes are picked by age. A process is more likely to be swapped in if it has been swapped out a long time, or is small. There are checks to prevent thrashing, basically by not letting a process be swapped out if it has not been in memory for a certain amount of time.

If jobs do not need to be swapped out, the process table is searched for a process deserving to be brought in (determined by how small the process is and how long it has been swapped out). If there is not enough memory available, processes are swapped out until there is.

Many UNIX systems still use the swapping scheme just described. All Berkeley UNIX systems, on the other hand, depend primarily on paging for memory-contention management, and depend only secondarily on swapping. A scheme similar in outline to the traditional one is used to determine which processes get swapped in or out, but the details differ and the influence of swapping is less.

1.5.2 Demand Paging

Berkeley introduced demand paging to UNIX with BSD (Berkeley Software Distribution), which transferred memory pages instead of whole processes to and from a secondary device; recent releases of the UNIX system also support demand paging. Demand paging is done in a straightforward manner. When a process needs a page and the page is not there, a page fault to the kernel occurs, a frame of main memory is allocated, and then the page is loaded into the frame by the kernel.

The advantage of the demand paging policy is that it permits greater flexibility in mapping the virtual address space of a process into the physical memory of a machine, usually allowing the size of a process to be greater than the amount of available physical memory and allowing more processes to fit into main memory. The advantage of the swapping policy is that it is easier to implement and results in less system overhead.

1.6 FILE SYSTEM

The UNIX file system supports two main objects: files and directories. Directories are just files with a special format, so the representation of a file is the basic UNIX concept.

1.6.1 Blocks and Fragments

Most of the file system is taken up by data blocks, which contain whatever the users have put in their files. Let us consider how these data blocks are stored on the disk.

The hardware disk sector is usually 512 bytes. A block size larger than 512 bytes is desirable for speed. However, because UNIX file systems usually contain a very large number of small files, much larger blocks would cause excessive internal fragmentation. That is why the earlier 4.1BSD file system was limited to a 1024-byte (1K) block. The 4.2BSD solution is to use two block sizes for files which have no indirect blocks: all the blocks of a file are of a large block size (such as 8K), except the last. The last block is an appropriate multiple of a smaller fragment size. Implementation details force a maximum block-to-fragment ratio of 8 : 1 and a minimum block size of 4K, so a typical choice is 4096 : 512.

Suppose data are written to a file in transfer sizes of 1K bytes, and the block and fragment sizes of the file system are 4K and 512 bytes. The file system will allocate a 1K fragment to contain the data from the first transfer. The next transfer will cause a new 2K fragment to be allocated. The data from the original fragment must be copied into this new fragment, followed by the second 1K transfer. The allocation routines do attempt to find the required space on the disk immediately following the existing fragment so that no copying is necessary, but, if they cannot do so, up to seven copies may be required before the fragment becomes a block. Provisions have been made for programs to discover the block size for a file so that transfers of that size can be made, to avoid fragment recopying.

1.6.2 Inodes

Associated with each file in UNIX is a little table (on disk) called an inode. An inode is a record that describes the attributes of a file, including the layout of its data on disk. Inodes exist in a static form on disk, and the kernel reads them into main memory to manipulate them. Disk inodes consist of the following fields:

* File owner identifier - File ownership is divided between an individual owner and a group owner and defines the set of users who have access rights to a file. The super user has access rights to all files in the system.
* File type - Files may be of type regular, directory, character or block special, or pipes.
* File access permissions - The system protects files according to three classes: the owner and the group owner of the file, and other users; each class has access rights to read, write and execute the file, which can be set individually. Although a directory is a file, it cannot be executed; execution permission for a directory gives the right to search the directory for a file name.
* File access times - Giving the time the file was last modified and when it was last accessed.

In addition, the inode contains 15 pointers to the disk blocks containing the data contents of the file. The first 12 of these pointers (as shown in figure 3) point to direct blocks; that is, they contain addresses of blocks that contain data of the file.

[Figure 3: Direct and indirect blocks of an inode. The inode holds pointers to direct disk blocks, followed by single indirect, double indirect and triple indirect pointers.]

Thus, the data for small files (no more than 12 blocks) can be referenced immediately, because a copy of the inode is kept in main memory while a file is open. If the block size is 4K, then up to 48K of data may be accessed directly from the inode.

The next three pointers in the inode point to indirect blocks. If the file is large enough to use indirect blocks, the indirect blocks are each of the major block size; the fragment size applies only to data blocks. The first indirect block pointer is the address of a single indirect block. The single indirect block is an index block, containing not data, but rather the addresses of blocks that do contain data. Then, there is a double-indirect-block pointer, the address of a block that contains the addresses of blocks that contain pointers to the actual data blocks. The last pointer would contain the address of a triple indirect block; however, there is no need for it. The minimum block size for a file system in 4.2BSD is 4K, so files with as many as 2^32 bytes will use only double, not triple, indirection.
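The claim that double indirection is enough can be checked with a short calculation. The following Python sketch assumes the 4K minimum block size, 12 direct pointers and 4-byte block pointers; the function and constant names are illustrative only.

```python
BLOCK_SIZE = 4096       # minimum 4.2BSD block size, in bytes
POINTER_SIZE = 4        # assumed size of one block pointer, in bytes
DIRECT_POINTERS = 12    # direct block pointers held in the inode


def inode_capacity(block_size=BLOCK_SIZE, pointer_size=POINTER_SIZE,
                   direct=DIRECT_POINTERS):
    """Return the bytes reachable via direct, single and double indirect pointers."""
    pointers_per_block = block_size // pointer_size
    direct_bytes = direct * block_size
    single_bytes = pointers_per_block * block_size
    double_bytes = pointers_per_block ** 2 * block_size
    return direct_bytes, single_bytes, double_bytes


direct_bytes, single_bytes, double_bytes = inode_capacity()
total = direct_bytes + single_bytes + double_bytes
print(direct_bytes, single_bytes, double_bytes)   # 49152 4194304 4294967296
print(total, total > 2 ** 32)                     # 4299210752 True
```

With a 4K block each index block holds 1024 pointers, which is why the double indirect level alone already covers 2^32 bytes.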
That is, as each block pointer takes 4 bytes, we have 49,152 (4K x 12) bytes accessible in direct blocks, 4,194,304 bytes accessible by a single indirection, and 4,294,967,296 bytes reachable through double indirection, for a total of 4,299,210,752 bytes, which is larger than 2^32 bytes. The number 2^32 is significant because the file offset in the file structure in main memory is kept in a 32-bit word. Files therefore cannot be larger than 2^32 bytes. Since file pointers are signed integers (for seeking backward and forward in a file), the actual maximum file size is 2^31 bytes. Two gigabytes is large enough for most purposes.

1.6.3 Directory Structure

Before a file can be read, it must be opened. When a file is opened, the operating system uses the path name supplied by the user to locate the disk blocks, so that it can read and write the file later. Mapping path names onto i-nodes (or the equivalent) brings us to the subject of how directory systems are organized. These vary from quite simple to reasonably sophisticated.

Now let us consider some examples of systems with hierarchical directory trees. Figure 4 shows an MS-DOS directory entry. It is 32 bytes long and contains the file name and the first block number, among other items. The first block number can be used as an index into the FAT, to find the second block number, and so on. In this way all the blocks can be found for a given file. Except for the root directory, which is of fixed size (112 entries for a 360K floppy disk), MS-DOS directories are files and may contain an arbitrary number of entries.

[Figure 4: The MS-DOS directory entry]

The directory structure used in UNIX is extremely simple, as shown in figure 5. Each entry contains just a file name and its i-node number. All the information about the type, size, times, ownership, and disk blocks is contained in the i-node (see figure 3). All directories in UNIX are files, and may contain arbitrarily many of these entries.

[Figure 5: A UNIX directory entry: a 2-byte i-node number followed by a 14-byte file name]

When a file is opened, the file system must take the file name supplied and locate its disk blocks. Let us consider how the path name /usr/ast/mbox is looked up. We will use UNIX as an example, but the algorithm is basically the same for all hierarchical directory systems. First the file system locates the root directory. In UNIX its i-node is located at a fixed place on the disk.

Then it looks up the first component of the path, usr, in the root directory to find the i-node of the file /usr. From this i-node, the system locates the directory for /usr and looks up the next component, ast, in it. When it has found the entry for ast, it has the i-node for the directory /usr/ast. From this i-node it can find the directory itself and look up mbox. The i-node for this file is then read into memory and kept there until the file is closed. The lookup process is illustrated in figure 6.

[Figure 6: The steps in looking up /usr/ast/mbox. Looking up usr in the root directory yields i-node 6; i-node 6 says that /usr is in block 132; block 132, the /usr directory, gives i-node 26 for ast; i-node 26 says that /usr/ast is in block 406; in block 406, the /usr/ast directory, mbox is i-node 60.]

Relative path names are looked up the same way as absolute ones, only starting from the working directory instead of starting from the root directory. Every directory has entries for . and .. which are put there when the directory is created. The entry . has the i-node number for the current directory, and the entry for .. has the i-node number for the parent directory. Thus, a procedure looking up ../dick/prog.c simply looks up .. in the working directory, finds the i-node number for the parent directory, and searches that directory for dick. No special mechanism is needed to handle these names. As far as the directory system is concerned, they are just ordinary ASCII strings.

1.7 SUMMARY

In this unit we discussed issues broadly related to CPU scheduling, memory management schemes and file systems of the UNIX operating system. We did not go into the implementation details of these schemes or of the system calls. Students are strongly advised to refer to the book "The Design of the UNIX Operating System" by Maurice J. Bach for a detailed discussion. The main points covered in this unit are:

* UNIX provides a good programming environment. It supports features that allow complex programs to be built from simpler programs.
* The UNIX file is simply a sequence of bytes without any meaning imposed on it; its meaning depends mainly on the programs that interpret it.
* The Kernel allocates the CPU to a process for a small fraction of time, preempts a process that exceeds its time slice and feeds it back into one of several priority queues. One substantial difference between UNIX and many other systems is the ease with which multiple processes can be created and manipulated.
* Memory management in UNIX is swapping supported by paging.
* The UNIX file system supports two main objects: files and directories. Directories are just files with a special format, so the representation of a file is the basic UNIX concept.

1.8 MODEL ANSWERS

Check Your Progress

1. * UNIX is written in a high-level language, C. This makes the system highly portable.
   * UNIX supports a good programming environment. It allows complex programs to be built from smaller programs.
   * UNIX uses a hierarchical file system that allows easy maintenance and efficient implementation.

2. The UNIX system uses a hierarchical file system structure. Any file may be located by tracing a path from the root directory. Each supported device is associated with one or more special files. Input/output to special files is done in the same manner as with ordinary disk files, but these requests cause activation of the associated devices.
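As a supplementary illustration of the path name lookup from section 1.6.3, the walk from the root directory to /usr/ast/mbox can be sketched in Python. This is a toy in-memory model rather than kernel code: each directory is a dictionary of name to i-node number, the i-node numbers 6, 26 and 60 follow figure 6, and the root i-node number and the name lookup are assumptions of the sketch.

```python
ROOT_INODE = 1   # assumed i-node number of the root directory

# Maps a directory's i-node number to its entries (file name -> i-node number).
directories = {
    1: {".": 1, "..": 1, "usr": 6},
    6: {".": 6, "..": 1, "ast": 26},
    26: {".": 26, "..": 6, "mbox": 60},
}


def lookup(path, working_inode=ROOT_INODE):
    """Resolve `path` to an i-node number, one component at a time."""
    # Absolute names start at the root, relative names at the working directory.
    inode = ROOT_INODE if path.startswith("/") else working_inode
    for name in path.split("/"):
        if not name:
            continue                       # empty piece from a leading or doubled slash
        entries = directories.get(inode)
        if entries is None:
            raise NotADirectoryError(path) # a non-directory was used as a directory
        if name not in entries:
            raise FileNotFoundError(path)
        inode = entries[name]              # . and .. are handled like any other name
    return inode


print(lookup("/usr/ast/mbox"))                  # 60
print(lookup("../ast/mbox", working_inode=26))  # 60
```

The entries . and .. need no special case in the loop, which mirrors the remark that they are just ordinary strings to the directory system.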
UNIT 2 UNIX - GETTING STARTED I

Structure

2.0 Introduction
2.1 Objectives
2.2 Getting Started
    2.2.1 User Names and Groups
    2.2.2 Logging in
    2.2.3 Correcting Typing Mistakes
    2.2.4 Format of UNIX Commands
    2.2.5 Changing Your Password
    2.2.6 Characters with Special Meaning
    2.2.7 UNIX Documentation
2.3 Files and Directories
    2.3.1 Current Directory
    2.3.2 Looking at the Directory Contents
    2.3.3 Absolute and Relative Paths
    2.3.4 Some UNIX Directories and Files
2.4 Summary
2.5 Model Answers

2.0 INTRODUCTION

2.1 OBJECTIVES

After going through this unit you shall have learnt:

* how to start a login session under UNIX
* some basic concepts like the hierarchical directory structure
* the various types of files under UNIX
* how to close a login session

2.2 GETTING STARTED

We will now learn how to start using a UNIX computer. This unit will talk about the basic steps involved in signing on to a system running UNIX and also what you can do once you have signed on. This block is of necessity very brief and can only serve as an introduction. Use any other material to which you have access and experiment to your heart's content. You will learn as much from your mistakes and from seeing unexpected outcomes as from things you do by the book.

2.2.1 User Names and Groups

Every UNIX user is given a name when he is allowed access to a UNIX system. This is also called an account, as in commercial arrangements an account is kept of the usage of the machine by each user. The user name need not have any relation to the actual name of the user, though it quite often is some abbreviation of the name. For example a person called Ram Kumar would usually be given a user name kumarr on a UNIX system. This is formed by his surname (abbreviated if it is too long) and the first letter of his first name.

It is quite possible for a person to have more than one account on a single machine (in a different name), especially if the person uses the machine in more than one capacity. For example Ram Kumar might be working on two programming projects, in both of which he is part of a team. On his cryptography project he might have a name crypt02 and on his natural language processing project he might have a name like nlp04. The reason for this kind of arrangement has to do with his access rights and privileges in his different projects. Also if Ram Kumar leaves the company his successor Zafar Khan might continue his work on nlp04 while somebody else is assigned to work on crypt02.

One of the things that motivated the designers of UNIX was their desire for easy sharing of information, consistent with the needs of security and privacy. So UNIX allows user names to be grouped together under a common group name. All users belonging to the same group can share group privileges.

Some user names are reserved by UNIX for its use, for example bin and uucp. So you cannot use these names for yourself. There is also a special kind of user on every UNIX system who has all possible access rights on the system. This user is called the super user, the system administrator or simply root because that is the user name conventionally allotted to him. For administrative convenience large systems can have more than one super user account. The super user is the one who can create new user accounts, shut down the system and perform other maintenance tasks.

You might be wondering why everybody cannot access the computer as root. The reason is that when you are granted access to a computer system you are assigned a user name as well as a password. You can set your password to whatever you want subject to certain constraints. So you cannot enter the computer as root unless you know the root password. The root password is zealously protected on any well maintained installation as public knowledge of this password would compromise the security of the installation. While root can access all user files and override any system protection meant for mere mortals, nobody can figure out what your password is. However root can change or remove your password.

Check Your Progress 1

1. Can more than one person use the same user account on a UNIX system?
2. Can there be more than one account with the same name on a UNIX system?
3. Can more than one user account have the same password?

2.2.2 Logging in

You will now learn how to gain access to a UNIX system so that you can use its facilities. This process is called logging in to the computer. To be able to login to a machine you must have a valid user account on it and you must know your password. Your account would have been created for you by the system administrator when you were allowed to use the computer. At that time you would also have been told your first password. When you see your terminal it would be displaying a message like

    IGNOU UNIX computer
    login:

The actual message on the first line depends on the installation. This could even be absent. The message does not affect anything else you do in any way.

You should now type in your user name and press the RETURN key. In most cases you have to press the return key for the computer to register what you have typed. This key is sometimes labelled as ENTER. You will find that as you type on the terminal screen you will be able to see whatever you have typed. This is because UNIX usually echoes whatever you type on the terminal. So your screen should now look like this

    IGNOU UNIX computer
    login: kumarr
    Password:

You must type in your user name, also called the login name, exactly as allocated by the super user. This is because UNIX is case sensitive, that is, it distinguishes between lower case and upper case letters. In this respect it differs from operating systems like VMS. So be careful of small and capital letters while working on UNIX.

Notice that your password is not echoed. This is to prevent somebody from reading your password over your shoulder, as that would enable that person to masquerade as you by logging into the computer in your name and using it.

UNIX now checks whether you are a valid user and whether you entered the correct password. If there is any mistake you get a message saying

    Login incorrect
    login:

This means you can try to login again. There can be other reasons why you might not be able to login even though you are a valid user and did not make any typing mistakes. The messages you get in those situations will however be different.

Why be so pessimistic? Let us assume you have managed to login successfully. The system will then display some messages and finally give you a sign that it is now ready to obey your commands. The messages you see depend on how the system has been configured or set up by the system administrator and by you. So you might not even see any messages. However usually there is a message indicating when you logged in last. This is useful because if the date and time mentioned there are different from what you remember about your last login, it could mean that somebody else is using your account.

Let us now look at some of the other common types of messages you see on most systems as you login. These usually give some information about the system like the space available on the machine, news about the system and whether you have any mail. The news is called the message of the day and appears whenever you login. The message

    You have mail.

means someone has sent you mail using the user to user communication facilities available in UNIX (see unit 4).
After the login messages you see a prompt, which is the sign that UNIX is ready for your commands. The prompt can be changed to whatever you like but the default prompt depends on what shell you have been assigned. One of the most common shells is the C shell, which has the following prompt by default

    %

This is the prompt we will use throughout the block unless some other prompt is explicitly called for. When you see the prompt on your terminal it normally means that UNIX has finished executing the last command you gave it and is ready for your next command.

On some UNIX installations there is a limit on the number of attempts, say five, you can make at logging in. The action taken depends on the installation but can be alerting the system administrator or deactivating the terminal, perhaps for a short time only. So you should be careful not to make too many typing mistakes. In particular be careful not to forget or mistype your password and avoid passwords with certain characters like #.

Check Your Progress 2

1. What happens if you type in your login name all in upper case?
2. Try logging in by using a friend's account (do not ask him for his password). Are you able to get the prompt?
3. Try logging in using an account which does not exist on your system (confirm this from your system administrator). Is there any difference in the computer's response from that in the last exercise? Why do you think this is so?
4. Find out why you might fail to login even though you did nothing wrong while trying to login. Try to get at least five reasons.

2.2.3 Correcting Typing Mistakes

Many of us are not professional typists and we make a lot of mistakes while typing. In any case all of us are human beings and are prone to error. Whether you are a one or two finger expert or know touch typewriting, you are going to mistype your commands some time or the other. What do you do when you want to find out in a session whether you have any Sundays left in the month? Normally you would use the cal command thus

    % cal

Suppose now that by mistake you type

    % csl

After you press the return key UNIX will say

    csl: Command not found

if you are lucky and a command csl does not exist. If it does it will be executed and you could well be in deep trouble depending on what csl does.

You would therefore do better to cancel your command or correct your mistake. These actions can be accomplished by using the kill and erase characters respectively. The kill character cancels the entire line you typed while the erase character erases or rubs out the last character.

On most UNIX installations you can use the backspace key to erase the previous character. On some terminals the erase character is ^H. This means that you have to type H while holding down the CONTROL key. This key is usually marked CTRL or CTL. It is usually located on both sides of the keyboard near the Shift keys. In this block we will write ^H to mean CONTROL-H and you must be careful not to confuse this with the two separate characters ^ (circumflex) and H.

Every time you press the erase key the cursor moves back one character after deleting the last character. So to correct your mistake

    % csl

you should press the erase character twice so that you see

    % c

and then retype 'a' and 'l' correctly. You can then press the ENTER key to run the cal command.

The line kill character tells UNIX to kill the line, that is, to ignore everything on the line. You do not get the prompt after typing this character unless you press the RETURN key. The line kill character is usually @ but can be changed to something else.
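The effect of the erase and line kill characters described above can be modelled with a few lines of Python. This is a simplified sketch and not how the terminal driver is actually implemented; the function name edit_line and its defaults (^H as erase, @ as kill) are assumptions made for illustration.

```python
ERASE = "\b"   # Control-H, written ^H in the text, is the ASCII backspace character
KILL = "@"


def edit_line(typed, erase=ERASE, kill=KILL):
    """Return the command line that remains after erase and kill are applied."""
    line = []
    for ch in typed:
        if ch == erase:
            if line:
                line.pop()     # rub out the previous character, if there is one
        elif ch == kill:
            line.clear()       # ignore everything typed so far on this line
        else:
            line.append(ch)
    return "".join(line)


print(edit_line("csl\b\bal"))   # mistyped csl, erased twice, then typed a and l
print(edit_line("csl@cal"))     # killed the whole line and started again
```

Both calls leave the command cal, which is what would be run when the RETURN key is pressed.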
‘There is a command called stty which enables you to see what the erase and line kill character are, It also allows you to change them if you wish, The command allows you to ‘examine and alter many other terminal seutings as well, but for the moment we will not ‘consider anything else other than erase and line kill, Just type in ‘say ‘and observe the output. It will, among other things, say something like crase AHKill @ ‘This means that your erise character is MH and your line kill character is @. ‘Suppose you want to change your kill character to AX. You can do this by running the command % stty Ki AK Now typing @ has no effect on the command you type other than putting an @ as part of Your command. It no longer kills your command line. You can similarly set your erase ‘character to # by saying Ge suy erase # Both the settings can be changed at one stroke by saying % stty kill @ erase AH Any changes you make using sity will remain in effect only forthe duration of your current login session or until you use the command to make more changes. Next time you login the ‘characters revert 10 whatever the system administrator has configured them. You might wonder what will happen if you set the erase character to something like’ in the Bourne shell which has the default prompt of a § sign. Try sh S stty erase 1 and now try running the cal command. You will not be able to convey cal to the computer because "I"is taken as an instruction to erase the previous character. Itis for this reason that the erase and line kil characters aro usually not set to any characters you use commonly az ‘part of commands, Thus itis better tiot to set them to letters, digits or hyphens or even other commonly used special characters. However if you insist on using ‘I'as your erase character you can sill run the cal command by typing. . 
    $ ca\l

The \ is called the escape character because it turns off any special meaning attached by the system to the character immediately after it. This act is called escaping the character. If the character immediately after '\' has no special meaning then the '\' has no effect. So you can type

    $ \c\a\l

which will have the same effect as

    $ ca\l

Check Your Progress 3

1. What are the characters you need to use to correct typing mistakes while logging in?
2. Do these depend on your erase or kill characters in your last login session?
3. Can you do away with the erase or line kill characters altogether (that is, no character has the effect of erase or line kill)?
4. Can you set both erase and kill to the same character?
5. What happens if you set your erase character to '\'? How do you escape special characters now?

2.2.4 Format of UNIX Commands

We will now look at the general format of UNIX commands and take the opportunity to study some simple commands. Let us go back to the C shell and the cal command which we mentioned earlier.

    $ ^D
    % cal
This gives the calendar for the current month and year (of course this depends on what the system date has been set to, which is what the computer believes to be the current month and year), and you can use it to, say, find out how many Sundays are left in the month. Another simple command is

    % date

which displays the current system date and time. You will realise that the computer has no way of knowing what the current date and time really are, so what it can tell you is only what it thinks is the current date and time. This can be set by the system administrator to almost anything but in most installations, especially those that are networked with other computers, care is taken to see that the date is set correctly. The date in UNIX means date and time, so the output of the command is something like

    Wed Jun 15 13:44:39 IST 1994

Notice that the time zone is part of the output. This is significant when you are on a network spanning time zones. Another simple one word command is

    % who
    kumarr   tty03   Jun 15 11:49
    khanz    tty05   Jun 15 10:22
    nilpos   tty07   Jun 14 23:36

This tells you the names of the users currently logged in to the system, their terminal numbers and the date and time they logged in. You will find that you will always be listed as one of the users, since you usually run the commands only when you are logged in to the machine.

There is another form of the who command which you can now try out.

    % who am i
    kumarr   tty03   Jun 15 11:49

This time we have given the arguments am i to the actual command who. The result is similar to that obtained earlier, but now you are the only user listed. This command has the effect of telling you the login name of the user currently logged in at that terminal, the terminal number of the terminal and the date the user logged in. Other users of the system are not listed. Arguments to commands are separated from the command by one or more spaces.
It might seem silly to ask the computer who you are, but if the previous user has not terminated his session, you can find who it was by this command. But you would do well never to leave your terminal unattended while you are logged in, as it would be a security lapse. Some versions of UNIX provide the command

    % who are you

which is synonymous with who am i, but sounds much more intelligent.

You have now seen the general format of UNIX commands, which comprises the basic command followed by zero or more arguments. The command and the various arguments are separated by one or more spaces and the whole sequence is terminated by the newline character, which is produced when the ENTER key is pressed.

You can enter more than one command on the same line by separating the commands from one another with semicolons like this

    % date ; who
    Wed Jun 15 14:02:11 IST 1994
    kumarr   tty03   Jun 15 11:49
    nilpos   tty07   Jun 14 23:36
    crypto   tty08   Jun 15 13:57

The commands are executed one after the other in the order they were specified on the command line. After the last command is over you get the prompt again.

Arguments to commands should not contain spaces, otherwise the different words of the argument would be interpreted as different arguments by the computer. If for some reason the argument needs to contain a space, you must enclose the argument in double quotes (") or in single quotes (').

Most arguments to commands are filenames (discussed later in this unit), options or expressions. All of these could occur in the same command. The exact order in which these arguments are listed can depend on the command and should be ascertained by examining the documentation for that command. Usually options immediately follow the command with the expressions and filenames coming next. You will see details of such cases later when we study more complex commands than the ones we have looked at so far.
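The rules about spaces, quotes and semicolons can be checked with a small sketch. It is written in Bourne shell (sh) syntax rather than the C shell used in the text, because it relies on a shell function; count_args is a hypothetical helper invented for this illustration and is not a UNIX command.

```shell
# Hypothetical helper: report how many arguments it received.
count_args() {
    echo "$#"
}

count_args Ram Kumar      # unquoted space: two separate arguments
count_args "Ram Kumar"    # double quotes keep the space inside one argument
count_args 'Ram Kumar'    # single quotes have the same effect here

# Two commands on one line, separated by a semicolon, run left to right.
echo first ; echo second
```

The first call reports two arguments and the quoted calls report one each, which is the behaviour the paragraph above describes.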
If an argument itself contains quotes of one kind you can enclose it in quotes of the other kind. Thus

    % grep -n "Ram Kumar's Salary" employee payroll

looks for the expression Ram Kumar's Salary in the files employee and payroll and prints the lines in which the expression is found, along with their line numbers.

Sometimes the shell places restrictions on the use of certain characters because it interprets them in some special way. To use these characters in arguments, you have to use quotes. The details of the C-shell are discussed in unit 4.

Check Your Progress 4

1. How can you get the calendar for some other month in some other year?
2. Get the calendar for the year 1752 and look at it. Is anything the matter?
3. Find out how to set the system date. Why do you think only the super user is allowed to do this?
4. Study the who command and use it to find the date the machine was started up, and also how many users are currently using your system.

2.2.5 Changing your Password

You saw earlier that your password was the only way of preventing somebody else from using your account on the system. Without it anybody who knew your login name could walk up to the machine and start using your account. This would be really serious in the case of the super user or root.

When you are first given your account you are told what your password is. On some installations your account is set up without a password and you are asked to choose one for yourself the first time you login. This can be done with the command

    % passwd
    Changing password for kumarr
    Old password: pi=3.14
    New password: exp1=2.71
    Re-enter new password: exp1=2.71

Note that unlike the commands you saw so far, the passwd command is interactive. It asks you to enter some information rather than doing all the work by itself. The first thing it asks for is your current (or old) password. This is to make sure that somebody else cannot change your password while you have left your terminal unattended.
If the wrong password is entered here, the computer says Sorry and gives you back the prompt.

If you enter the old password correctly you are asked to type in your new password. After that you are asked to enter it again. If you type two different things here, the system tells you that they do not match and asks you to try again. If you keep getting mismatches the command terminates after telling you to try again later. This is because if you cannot type your new password the same way twice, you are unlikely to be able to enter it correctly to login. Although in the example above we have shown the passwords, in an actual session none of the passwords will be echoed. Your system will probably have restrictions on what passwords you can choose. The password should not be too short or too long. You should change it periodically so that if someone has been using your account by laying hands on your password, they cannot continue to do so indefinitely.

If you are wondering how the passwords are stored on the machine such that even the super user cannot find out what your password is, the answer is that UNIX encrypts your password before storing it. This means that what is stored on the computer bears no resemblance at all to what you typed in as your password. When you try to login the next time, UNIX again encrypts the password you type in and compares it with what has been stored. If the two are the same, you are allowed to login, otherwise your attempt is blocked. So while a super user, or anyone else for that matter, can read your encrypted password, nobody can find out what the actual password is, at least not easily.

There is another form of the passwd command which is used to change the password of some other user. By default the passwd command allows you to change the password of the user who is logged in to the terminal.
So to change the password of khanz, you could say

    % passwd khanz

where the name of the user whose password is to be changed forms the argument to the passwd command. The rest of the behaviour of the command is just as before. You will now realise that you can change the password of any user, including your own, only if you know his current password. On your system you might simply get the following message if you try to change somebody else's password

    Permission denied.

How then does root have the power to change your password? Ah! When the user executing the passwd command is the super user, UNIX does not ask for the old password. This is how the super user can change your password to anything without knowing what it is currently.

Check Your Progress 5

1. Can you run the passwd command and again set your password to what it already is?
2. Can a friend (not the super user) help if you have forgotten your password?

Since your password is the only way of protecting your account, you must take care to choose passwords well, that is, choose one which cannot be easily guessed. As a general rule do not write down your password anywhere; let it be locked up in your head. Do not choose the names of members of your family or close friends or your dog. In fact it is best to avoid all proper names and all the words in the dictionary. Let your password have a few special characters in it (not those which have anything to do with terminal settings). Some UNIX implementations enforce rules like these when you set your password.

2.2.6 Characters with Special Meaning

As mentioned before some characters are interpreted in a special way by the C-shell. These meanings will be discussed in detail in the next unit of this block, but that apart, there are certain characters you will find to be useful.
For example, suppose you start a command which takes a long time to execute, and you change your mind and do not want to wait for the command to finish. You can abandon or break a command in between by pressing the BREAK character, which is the key marked DEL on most systems. This key can be set using our old friend, the stty command

    % stty intr ^C

sets the INTERRUPT character to CONTROL-C.

Again, consider a command which produces a lot of screen output. This could happen if you were typing out a long file, for example. The output will probably be dumped on your terminal far too fast for you to read. To stop output on the screen temporarily, type ^S. You can restart the output by typing ^Q.

Another special character can be used to terminate your login session. This is ^D, which indicates to the shell that there is not going to be any more input from you. So UNIX logs you out and again displays the login message on the screen for another user, or you again, to login and start a new session. You can also logout by saying

    % logout

or

    % exit

Check Your Progress 6

1. The commands you have learnt so far produce only a small amount of screen output. How will you produce output which does not fit into one screen, using only the commands you have learnt so far?
2. How does the BREAK character differ from KILL?
3. Can you transfer the functions of ^S, ^Q and ^D to some other characters?

2.2.7 UNIX Documentation

UNIX comes with copious documentation, some of which is often available on-line. You should learn how to use the UNIX manuals. While we will not discuss this topic in detail here, you will have to acquire this skill if you want to obtain a good understanding of UNIX. This is because in this block we do not have the space to consider any but the most basic commands, and even those only briefly. We will not even be able to consider all the options available with many of the commands that we do discuss.
The only way for you to master them will be by consulting the documentation.

If any documentation is available on-line at your installation, you can look up the manual entry for a command by using the man command. For example, to learn more about the who command than what we have talked of, say

    % man who

You can similarly learn more about date, cal or any other command. So to learn more about the man command itself, say

    % man man

If the documentation is not on-line you will have to use the printed UNIX manuals.

Check Your Progress 7

1. Look up the manual entries for all the commands we have studied so far. What do you feel about the number of options available with each command?

2.3 FILES AND DIRECTORIES

In this section we will describe the file and directory structures of UNIX. Just as a paper file is something into which you can put papers and bunch a group of papers together, a UNIX file is something into which you can put data. A file has a name, and this name is a property of the file rather than of the data present in it at any given time. It is possible to change the data in a file. This act does not affect the name of the file. Thus UNIX commands can be made to operate on the data in a file as a group.

A file usually exists on the hard disk(s) of the computer. This will be the case when you are logged in to the machine and are engaged in a session. The actual areas of the hard disk used by a file can change as the file is increased and decreased in size. As you will see later, the size of a file in UNIX has a precise technical meaning, and the size of a file does not necessarily tell you the actual amount of data in it.

UNIX has three kinds of files: ordinary, directory and special. You have already got an idea of what ordinary files are. Special files will be discussed in unit 5 of this block.
Directory files contain information about other files, including other directories or special files. A directory groups its contents together hierarchically under itself, and a directory within a directory is called a subdirectory of the directory at the higher level, which is also called the parent directory. Thus in UNIX the file system is like an inverted tree of directories, starting at a root and going down to an arbitrary depth of hierarchically arranged levels.

We will now look at some of the files in UNIX and learn how to use the file structure.

2.3.1 Current Directory

Every user who is given an account on a UNIX system is also given a directory where he reaches on logging in. This directory is called the home directory. The current, working or current working directory is the directory in which you are currently located. On logging in, your current directory is normally your home directory. You can find out what your current directory is at any time by using the command

    % pwd
    /usr/kumarr

This means that your current directory is called kumarr and is located under the directory usr, which is in turn located under the root directory. Of course the actual home directory you are allotted will depend on your installation. By the way pwd is one of the few UNIX commands which do not take any arguments or options.

The output that pwd displays is called the full pathname of your current working directory. This is also known as the complete or absolute pathname, that is, the pathname starting from root. You can refer to your directory by just saying kumarr. But this is not unambiguous because there can be another directory called kumarr under some other directory as well. But no two directories or files on the same UNIX machine can have the same complete or full pathname. The various components of the path are separated from one another by slashes (/).
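A full pathname can be taken apart at its slashes to see the components described above. The following is a minimal sketch in Bourne shell syntax; the dirname and basename utilities are not covered in this unit and are used here only for illustration, and the variable names are arbitrary.

```shell
path=/usr/kumarr            # the example pathname from the text

parent=$(dirname "$path")   # everything before the last slash
name=$(basename "$path")    # the last component of the path

echo "parent directory: $parent"
echo "directory name:   $name"
```

For the pathname used here the parent is /usr and the name is kumarr, matching the description of kumarr as a directory located under usr.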
We have not yet talked of what a valid filename can be. Actually in UNIX there are no restrictions and a filename can have any characters up to a maximum of 14. The same rules apply to directories as well. In some UNIX implementations filenames can be of any arbitrary length. In practice it is best to avoid certain characters in filenames because they have special meaning to the shell.

Check Your Progress 8

1. What would happen if your home directory did not exist and you tried to login?

2.3.2 Looking at the Directory Contents

We will now see how to look at the contents of a directory. The command is

    % ls

This gives you a listing of all files in the current directory. If you have just been allotted your account and are logging in for the first time, you will be in your home directory and that directory will be empty, that is, there will be no files in it. ls has several options and it will take you some experimentation to understand them all. The first option we look at is

    % ls -a
    .
    ..

This is your first taste of UNIX options, so look at the command line carefully. The command ls is followed by at least one space after which the hyphen or minus sign introduces the option letter. The -a option tells UNIX to list all files including those that are 'hidden'. Hidden files are those whose names start with a '.' character. Unless the -a option is used, ls never lists such files in its output. The output of ls is always sorted in some order, the default order being alphabetical. This sort order can be altered by other options to ls which we will take up later. This is why the file (actually a directory) '.' is listed before '..' in the output.

The '.' refers to the current directory and '..' to its parent. These are pronounced dot and dot dot respectively. In this case '.' refers to /usr/kumarr and '..' to /usr. The directory '/' or root is its own parent. This output is of course not very interesting because your home directory is devoid of files and you do not yet know how to create any.
So let us look at some other directory. You can get the listing of any directory by supplying its name as an argument to ls. Thus to look at the directory listing of the root directory use the command

    % ls /
    aardvark
    bin
    dev
    etc
    lib
    lost+found
    tmp
    usr

We must caution you that it is very unlikely that you will see the same listing as shown here. It is self evident that the listing will depend completely on the machine you are working on. However there are some files that will surely exist in the root directory of a working installation. The directories from bin to usr are such files. As you have seen, the ls command lists one file per line of output. To see several names per line you can use

    % ls -x /
    aardvark  bin  dev  etc  lib  lost+found  tmp  usr

Now the output is sorted from left to right on each line. Another variation is the -C option which sorts down each column

    % ls -C /
    aardvark  dev  lib         tmp
    bin       etc  lost+found  usr

You might have found your output to be in one of these forms the first time itself. This would have been because your system was configured to make the -x or -C option the default option for ls.

From the outputs so far you can get no indication of whether the files shown are ordinary files or directories. For this you can use the -p option, which appends a '/' to every filename which is a directory. The '/' is not part of the name, so do not get confused. For example

    % ls -Cp /
    aardvark  dev/  lib/         tmp/
    bin/      etc/  lost+found/  usr/

Another such option is -F which also appends a '*' to every filename which is an executable file, that is, a command. Try it out and see whether the result differs from the -p option. If you have a really large directory you might want to use an option of ls which compacts the output

    % ls -m /
    aardvark, bin, dev, etc, lib, lost+found, tmp, usr

This gives you the filenames separated by commas. You can see from the above that the contents of the root directory consist of both directories and ordinary files.
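The -a, -p and -m options can be tried safely in a scratch directory. The sketch below uses Bourne shell syntax and assumes the mktemp command is available on your system; mktemp, mkdir, rm and the ': >' idiom for creating empty files are not covered in this unit and serve only to set up and remove the demonstration files. All names are made up.

```shell
# Build a small scratch directory to list (setup only).
demo=$(mktemp -d)
mkdir "$demo/bin" "$demo/dev"
: > "$demo/aardvark"      # an empty ordinary file
: > "$demo/.hidden"       # a hidden file: its name starts with '.'

plain=$(ls "$demo")       # hidden file is not shown
all=$(ls -a "$demo")      # shows '.', '..' and .hidden as well
marked=$(ls -p "$demo")   # directories get a trailing '/'
commas=$(ls -m "$demo")   # names separated by commas

echo "$plain"
echo "$all"
echo "$marked"
echo "$commas"

rm -r "$demo"             # remove the scratch directory again
```

Because the output is captured rather than sent to a terminal, the plain and -p listings appear one name per line. Only the -a listing includes .hidden, and only bin and dev carry the '/' marker.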
The directories here, or anywhere else, can themselves contain subdirectories. To see the contents of /usr, you can say

    % ls -xp /usr
    bin/  khanz/  kumarr/  lib/  tmp/

On most systems you will see the names of user accounts in this directory. The -p or -F options will show you that they are directories. You must have deduced that you are seeing the home directories of the users. You can also see your own home directory here. But wait. When you logged in and checked the name with pwd

    /usr/kumarr

you found your home directory specified differently. Why is this so? We have seen in the last section that the pwd command tells us the full, complete or absolute pathname of the current working directory. When we look at the contents of /usr, kumarr is merely one of the directories under it, and is shown as such. To get the complete pathname we must specify the preceding portion which is /usr. Thus the full or complete pathname is /usr/kumarr.

It will now be easy for you to realise that the bin you saw listed as one of the contents of the root directory, that is, '/', is different from the bin listed under /usr. The former has the full pathname /bin, whereas the complete pathname of the latter is /usr/bin. You can now look at the contents of the other directories and try specifying their complete pathnames. You can also try looking at their contents by providing relative pathnames. We will look at complete and relative pathnames again in the next section. You would do well to understand pathnames, relative and absolute, thoroughly as that will be necessary in navigating around the directory tree.

But let us now get back to our friend the ls command. One of the most useful and often used options is -l, which gives the so-called long listing of the directories asked for

    % ls -l /
    -rwxr-xr-x  1 root  root  1298 May 14 09:26 aardvark
    drwxr-xr-x  2 bin   bin   1248 Jan 01  1970 bin

Now this is a complicated looking output, so let us try and understand the meaning of this listing.
The first column of the output tells you whether the file is a directory or not. A '-' means that it is an ordinary file while a directory has a 'd' in that position. So you now know another way of telling whether a file is a directory, apart from the -p and -F options you have already looked at. The other 9 columns in the first field tell you about the permissions on that file. We will look at these in detail in section 2.4.6.

The next field in the output is a number indicating the number of links to the file. For a file this shows the number of names it has. In UNIX the same physical data may have several names, although it must have at least one. Each name is a link to the file. Usually ordinary files have only one link, but if there are more it does not mean that there are that many copies of the data in the file. There is only one physical copy of the data which can be referenced using any of its names. In the case of directories the number of links tells you about the number of subdirectories it has.

The third field of the output shows the owner of the file. root and bin are names reserved by UNIX for its use as we have seen earlier. In some cases you might see a number like 207 instead of the user name.

The next field is the group name and in certain situations can be a number in the display. The user is a part of the group shown here.

The fifth field is the size of the file in bytes. You already know that the size of a file in UNIX has a precise meaning which is not necessarily the same as the amount of data in it. However, do not be alarmed because in most cases the intuitive meaning of size does hold good and the figures you see usually do represent the number of bytes of data in the file in question.

The next item of information is the date the file was last modified, and in the end the name of the file is shown.

You now know how to find out many useful things about the file.
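The fields of the long listing can be picked out one at a time. The following sketch, in Bourne shell syntax, assumes mktemp is available and uses ln, cut and awk, none of which have been introduced in this unit; they are used here purely to create a second name for a file and to extract individual fields. The file names are made up.

```shell
demo=$(mktemp -d)
: > "$demo/data"                  # an empty ordinary file
ln "$demo/data" "$demo/alias"     # give the same file a second name (link)

dir_type=$(ls -ld "$demo" | cut -c1)                # first character for a directory
file_type=$(ls -l "$demo/data" | cut -c1)           # first character for an ordinary file
links=$(ls -l "$demo/data" | awk '{ print $2 }')    # second field: number of links
size=$(ls -l "$demo/data" | awk '{ print $5 }')     # fifth field: size in bytes

echo "directory type character: $dir_type"
echo "file type character:      $file_type"
echo "links to data:            $links"
echo "size of data in bytes:    $size"

rm -r "$demo"
```

The directory shows 'd' and the ordinary file shows '-'. After ln the link count of data is 2 although only one copy of the (empty) data exists, which is the point made above about names and links.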
You should now look at the long listing of the various system and other directories on your machine. In the course of this when you look at /bin you will see many familiar names. For instance, who, pwd and ls itself will be found in the /bin directory. Actually /bin is where many of the binaries or executables of the commands are to be found. There are other commands located under /usr/bin and /etc as well.

We will now briefly look at three other options to the ls command. When a directory is given as an argument to ls you get to see the contents of the directory. But suppose you want to check the permissions on a directory, say /usr/kumarr. If you try

    % ls -l /usr/kumarr

you will see nothing because ls tries to list the contents of the directory and at present there is nothing in your home directory. To see the desired output you could say

    % ls -l /usr

whereupon kumarr would be one of the entries. But this is awkward. The answer to this is the -d option

    % ls -ld /usr/kumarr

which lists /usr/kumarr as a directory and shows all the information about it.

You have seen that ordinarily subdirectories are shown only as single entries and any files inside them are not shown. To look at the contents of a directory and recursively of all subdirectories within it, use -R

    % ls -R /usr

This will show the contents of /usr and also recursively of every subdirectory inside it, down to ordinary files. Thus using

    % ls -R /

you can see every file and directory on your system.

Another option is the reverse option. The -r option reverses the sort order of files displayed by ls. You can try this with any option

    % ls -r /

So far you have given only directories as arguments to ls, but you can give it an ordinary file as well. It then lists only that file, if it exists. Moreover you can give any number of files or directories as arguments to ls and it will list whichever ones exist.
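The difference between listing a directory's contents and listing the directory itself, and the effect of -r and -R, can be seen in a scratch directory. This is a minimal sketch in Bourne shell syntax; it assumes mktemp is available, and mktemp, mkdir, rm and ': >' are setup helpers not covered in this unit. The file and directory names are made up.

```shell
demo=$(mktemp -d)
mkdir "$demo/nlp"
: > "$demo/nlp/augcfg.c"
: > "$demo/notes"

contents=$(ls "$demo")        # what is inside the directory
itself=$(ls -d "$demo")       # the directory entry itself, not its contents
reversed=$(ls -r "$demo")     # same names in reverse sort order
recursive=$(ls -R "$demo")    # also descends into the nlp subdirectory

echo "$contents"
echo "$itself"
echo "$reversed"
echo "$recursive"

rm -r "$demo"
```

Without -d the listing shows nlp and notes; with -d it shows only the name of the scratch directory. The -R listing is the only one in which augcfg.c appears, because that file lives one level down.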
If you feel out of breath after looking at these options, there are a few more we have not looked at. You are encouraged to look up the documentation for ls and experiment with them. Many UNIX commands have zillions of options, and getting used to them all requires time and effort. But you will find that you soon get to know the options you use often. It is probably best, when learning a new command, to concentrate on a few useful looking options only. As you use them frequently you will get to know them well. Then you can spend some time deepening your knowledge of the command by trying out the other options. Most beginners get overwhelmed by the large number of options and do not know where to start or when to stop. You will have to work out a method which suits you. Maybe you are the type who likes to learn everything about a command at one go. But many people, including the author, find that building on a solid foundation of already known options is easiest.

Check Your Progress 9

1. Read up on and try out the other options to ls. What is the result of ls -lm? Which option takes precedence? What is the result of ls -ml?
2. If your system has the -x or -C option set by default how can you get the standard ls listing?
3. Your terminal most likely has an escape sequence to clear the screen. Find out what the sequence is and create a file whose name contains the sequence. Thus if the sequence is ESC[2J, you can create a file called aESC[2Jbc. Now try and look at the listing of your directory. What do you see? How can you look at the listing?
4. If you see two files abcd and abcd in a directory listing, that is, two files with apparently the same name, what would you conclude? How would you confirm what you surmise?
5. On a system which allows filenames of arbitrary length, would you rather store data in the filename rather than in the file itself?
2.3.3 Absolute and Relative Pathnames

You saw in the last section how pathnames could be relative or absolute. Since the UNIX file system is logically structured like an inverted tree, it is important to understand how to specify pathnames. Both methods can be used and in UNIX it does not matter which approach you use in identifying the file you mean, as long as you are careful about specifying it correctly. However there are situations where one or the other approach is more convenient. So you should take the trouble to assimilate the concept and learn how to navigate around the file system with felicity.

Let us look at a typical directory hierarchy on a UNIX machine.

[Figure: a typical UNIX directory hierarchy]

Of course the exact layout of the directory hierarchy on your machine is likely to be different. We will soon be looking at some of the main directories and files on a UNIX system. For the moment though, just concentrate on learning how to move around. You already understand what is meant by the current directory. This is the directory in which you are located at any given time. If you say ls, it is the filenames in the current directory that are brought up for you to see. If you have logged in as kumarr, you will probably land up in /usr/kumarr when you get your prompt unless it has been arranged otherwise.

Now consider a file in /usr/kumarr/nlp like augcfg.c. Suppose you want to see the size of this file alone. For this you need to use the ls command and provide the filename as an argument to it. In UNIX you can provide a pathname (relative or absolute) as an argument to a command wherever you could otherwise provide a bare filename. So that actually gives you three ways of accomplishing what you want to do (we will assume that you have the required permissions; this will, in fact, be the usual situation).

Let us first use an absolute pathname. So you have to specify the filename starting from root '/'.
Thus your command needs to be

    % ls -l /usr/kumarr/nlp/augcfg.c

You have already used this method in the last section. The second way is to use a relative pathname, where you specify the pathname relative to where you are currently. Here you only need to remember that '..' stands for the parent directory of the current directory '.'. So if you are at /usr/khanz, you can say

    % ls -l ../kumarr/nlp/augcfg.c

The '..' takes you one level up, that is, to /usr. From there you continue naming the file as before. Of course you could have used the following rather convoluted way

    % ls -l ../../usr/kumarr/nlp/augcfg.c

This is inefficient because you implicitly move to root before naming the file. The first '..' takes you to /usr and the second '..' takes you one level higher, to '/' or root itself. Then you begin your descent until you reach the file you desire. Here it would have been better to use an absolute pathname instead, for then you would not have had to use two steps to reach root.

Usually a filename is specified by the method that results in the shortest possible specification of the name. This depends on whether the file is closer to you or to the root directory. Thus if you are located in /usr/khanz and you want to specify a file in the directory /usr/kumarr, it is easier to say ../kumarr rather than /usr/kumarr.

There is a third way of looking at the size of augcfg.c. For this you will have to learn a new command cd, which lets you change your current directory. This command can be given an argument which is your intended destination and it then changes your directory to what you asked, provided you have the appropriate permissions. And how do you specify your desired destination? By specifying the pathname, of course. The pathname can be specified, as you have seen, in either absolute or relative form

    % cd /usr/kumarr/nlp
or

    % cd ../kumarr/nlp

and then look at the size by

    % ls -l augcfg.c

This really amounts to specifying the filename relative to /usr/kumarr/nlp, the current directory. In general when you specify a bare filename you are specifying the filename relative to the current working directory. So the command above is really a shorter way of saying

    % ls -l ./augcfg.c

One form of the cd command can be very convenient if you have wandered far off your home directory and you want to return there, especially if your home directory happens to be far away from the root directory. This is

    % cd

without any arguments. It always brings you back to your home directory irrespective of where you are, even if you were there to start with.

Check Your Progress 10

1. Go to the root directory and then try to go to its parent with 'cd ..'. What happens? What do you conclude?

2.3.4 Some UNIX Directories and Files

It will be useful and interesting to get acquainted with the UNIX system directory structure. We will now look at the layout and contents of the UNIX system directories and understand how the various system files are grouped under directories. We will also learn about the functions of some of the system files. The UNIX directory structure is typically as shown in the earlier figure.

We again emphasise that only some of the system directories are shown here. Your machine could have a somewhat different organisation. How will you find out the directory tree for your UNIX system? You can now explore the files on your machine.

The directory /bin contains, as you have already seen, the executables of UNIX system commands. These include the commands you have learnt so far, like ls, cd, pwd and who. You can look at the long listing of this directory and note the information provided. Look at the sizes to get an idea of the sizes of executable files on your machine.
These will depend, among other things, on the architecture of your computer.

The /dev directory contains device special files concerned with hardware devices like printers, terminals and hard disks. You will learn more about these files and the /dev directory later in this unit.

The /etc directory, as the name suggests, has several miscellaneous files and directories. It contains many commands which are reserved for the use of the system administrator. Ordinary users cannot execute many of these commands. Apart from this, the /etc directory also contains some text files. Let us take a quick look at some of these text files.

/etc/issue contains the message shown before you login.

/etc/motd has the text of the message you see just after you login.

/etc/group has the names and group numbers of all the groups in the installation.

/etc/passwd contains the login name of each user, his user identification number, his encrypted password, his home directory, the default shell when he logs in and other information about him. In some cases the password is stored in another file called /etc/shadow.

/lib contains system libraries used with your C compiler.

/tmp is used to store temporary files. Some UNIX commands need work space in order to execute, and this is where they create their temporary files. This directory is cleared out periodically on many installations. In any case any files you put here can be erased without warning. So do not try to store anything here on a permanent basis. Keep files important to you under your home directory only.

/usr/bin, as you have already found, holds UNIX system commands which are more of utilities, although there is no clear distinction between commands in /bin and those located here.

/usr/include contains header files used in writing C programs.

/usr/games holds games distributed with UNIX. This might not be present on some installations.

/usr/local/bin is often present as a repository of local commands, often developed by local talent.
These are commands of interest to and found convenient in that installation.

While looking at the /usr/include directory, you must have noticed that all files have names ending in '.h', and similarly you will find many files with names ending in '.a' in /lib and /usr/lib. Although we said earlier that UNIX places no restrictions on the characters you can use to construct a filename, there are some conventions followed in a few cases. Usually filenames ending in '.xyz' are referred to as '.xyz' files or even as xyz files. Such conventions are not enforced by UNIX, although in many cases standard UNIX utilities might do so. Thus .h files are C or C++ program header files, C program files end in '.c' (enforced by cc), C++ program files end in '.C', lex source files end in '.l', yacc source files end in '.y', assembler source files in '.s', object code files in '.o', library archive files in '.a', SCCS (Source Code Control System, to be discussed in unit 4) files start with 's.', 'p.' and so on.

The file command is useful in determining of what type a given file is. This command takes any number of files as its arguments and tries to determine the type of each. Although it is not hundred per cent reliable and is open to deceit, the command usually does a good job.

Check Your Progress 11

1. Look up the UNIX documentation for the various utilities above and find out which of them enforce file naming conventions.
2. Run the file command on various kinds of files from the various directories you have seen and see if their types are reported correctly.

2.4 SUMMARY

In this unit we have started at the beginning and looked at many basic UNIX commands. However there are many useful commands we have not been able to examine. You must refer to the manual and learn these. You should now know enough about UNIX to conduct a session with ease.

2.5 MODEL ANSWERS

Check Your Progress 1

1. Yes, any number of persons can use the same user account on a UNIX system.
A UNIX system does not try to identify a person physically and there is no constraint on more than one person using the same account. So as long as they all know the password of the account they can always use it.
2. No, there can be only one account under one name. In fact, it is the name of the account which defines the account.
3. Yes, any number of accounts can have the same password.

Check Your Progress 2

1. The system thinks you are on a terminal which understands only capital letters and responds in upper case, with actual upper case letters escaped by preceding them with a backslash (\) character. You can now type a ^D to get back to the normal mode at the LOGIN: prompt.
2. You will not be able to get the prompt unless you are able to guess his password. Thus, your own password should be such that it is not easy to guess.
3. There is no difference in the system's response. This is so that an intruder (who might not know for sure whether a particular account exists) does not get any information about the existence of an account unless he is able to log in.
4. Five possible reasons are:
   (a) There might not be enough free disk space to let you login.
   (b) The system wide limit on the number of processes that can be run concurrently might have been reached.
   (c) The per user limit on the number of processes that can be run might have been reached.
   (d) The limit on the number of concurrent users laid down in the operating system software licence might have been already reached.
   (e) There might be a hardware malfunction.

Check Your Progress 3

1. The usual characters can be used. These are not affected by the erase or kill characters in your last login session.
2. By entering two '\' characters one after the other ('\\'). The first '\' escapes the second, which thereupon loses its special meaning.
3. No model answer.
4. No model answer.
5. No model answer.

Check Your Progress 4

1.
Say

    % cal mm yyyy

where mm represents the two digits of the month and yyyy the four digits of the year (in both cases leading zeroes can be omitted).
2. No model answer.
3. No model answer.
4. The command

    % who -b

tells you when the system was booted. To see how many users are logged on, say

    % who -q

Check Your Progress 5

1. This will depend on the particular flavour of UNIX you are using. Some versions allow it while others might not. However the super user can set the password to anything he wants without restriction.
2. An ordinary user cannot help you if you have forgotten your password because he cannot look at or change your password unless he knows it already.

Check Your Progress 6

1. No model answer.
2. The BREAK character terminates a process entirely, while the KILL character only cancels what was typed by the user on a line, so that it is not sent to UNIX.
3. Yes, you can do so.

Check Your Progress 7

1. No model answer.

Check Your Progress 8

1. You would not be presented with the UNIX prompt and would be returned to the login prompt with the message:

    No directory

Check Your Progress 9

1. No model answer.
2. No model answer.
3. The ls command is executed and the files are listed as usual. As soon as the file 7aESC[2Jbc is to come, the screen gets cleared because the terminal sees the sequence to clear the screen. You can say ls -b or ls -q to look at the listing without the characters getting interpreted. Some terminals might have a way of setting the hardware so that it does not interpret control sequences and displays them. But if you set your terminal to that, you will not be able to clear it at all, because the effect of the hardware setting is not limited to the duration of the ls command.
4. This can happen if one or both of the files have non-printable characters embedded. You can again use ls -b or ls -q to look at the actual names of the two files, and those are guaranteed to be different.
5. No model answer.
Check Your Progress 10

1. No model answer.
2. No model answer.
3. Yes, there is nothing to prevent you from having a file (or another directory) under a directory of the same name.

Check Your Progress 11

1. No model answer.
2. No model answer.

UNIT 3 UNIX - GETTING STARTED II

Structure
3.0 Introduction
3.1 Objectives
3.2 Looking At File Contents
3.3 Your Own Directories
3.4 File Permissions
3.5 Basic Operations On Files
3.6 Changing Permission Modes
3.7 Standard Files
    3.7.1 Standard Output
    3.7.2 Standard Input
    3.7.3 Standard Error
    3.7.4 Filters and Pipelines
3.8 Processes
    3.8.1 Finding Out About Processes
    3.8.2 Stopping Background Processes
3.9 Summary
3.10 Model Answers

3.0 INTRODUCTION

In this unit, you shall be introduced to some basic concepts of UNIX like files, directories, processes and standard files. It will be sufficient to allow you to perform many simple tasks. All the above features are illustrated with several realistic examples. In the course of this and the following units you will have occasion to see in action many of the theoretical concepts you have studied about operating systems. You will find it interesting and instructive to correlate theory with its implementation in UNIX.

3.1 OBJECTIVES

At the end of this unit you shall be able to:
* use some simple UNIX commands
* go through the file contents
* change the permission modes
* understand the concepts of filters and pipelines
* understand what a process is and how it differs from a program.

3.2 LOOKING AT FILE CONTENTS

So far you have not had occasion to look at what is stored inside a file. You have only seen the filenames in a directory listed by the ls command, or moved from one directory to another with cd, or seen the type of a file shown up by the file command. To look at the actual contents of a file like /etc/motd, say

    % cat /etc/motd

This prints the contents of the file on the terminal.
The cat stands for concatenate, and actually cat can be given any number of filenames as arguments. The effect of cat is to join all the files in the order specified in the arguments and to send them to the terminal screen. When only one file is specified the effect is to type that file on the screen.

It makes sense to cat only text files to the terminal, for otherwise the result might not be readable. If you cat a binary file, that is, one which can contain any arbitrary characters including control characters, you are likely to get strange results. Your cursor might jump hither and thither on the screen, the screen might get cleared from time to time, the terminal bell might sound or your keyboard might get locked. These effects are caused because the cat command is sending the contents of the file to the terminal. If the file contains character sequences that have special effects on your terminal, you will observe these effects occurring spontaneously. While this will not do any harm, it can be disconcerting or annoying and in any case will probably not be very useful, since you will not be able to make sense of what got typed on the terminal.

If the text file you type is a large one, the output will not fit on your screen. You have already seen that you can use ^S and ^Q to control the stopping and resumption of the output and thus read what is in the file. However if the file is more than a few screens long, it will be difficult for you to keep suspending and resuming output again and again. Moreover you will have to operate the keyboard dexterously if you want just enough of the output to scroll by but not too little or too much. If you are a trifle slow, some unread output might disappear before you are able to read it.

To look at a text file screen by screen, you can use the more command.

    % more /etc/termcap

This command shows you one screenful of the file and then pauses until you ask it to proceed further.
You can now examine the contents at leisure. When you are ready to look at the next part of the file, you can press the spacebar for seeing the next screenful. You can move one or more lines forward by pressing a number before the spacebar. You can also do this by pressing the number of lines to move forward followed by the RETURN key. Note that normally commands taken by more do not require you to press the RETURN key, so that as soon as you have pressed the spacebar the display moves forward by one screenful. You can search for a text string by saying '/' followed by the search pattern. The display will then scroll forward to show you the screen where that pattern occurs next. In general you cannot move backwards in the file by using this command. To terminate the display of the file, just press 'q' and you will get back your prompt.

There are several other options available with the more command. You should refer to the documentation and experiment with all the options so that you understand them thoroughly.

There is another command called pg which has a similar function. You could use this if more is not available or is not to your liking. You can compare both of them if both are available on your system. The ability to look at files in manageable screenfuls is often very useful, so be sure to know at least the basic options to these commands.

There is one more command that is useful when you want to look at only the last few lines of a file. To see the last 20 lines of a file you can say

    % tail -20 /etc/termcap

If the number of lines is omitted a default value of 10 lines is assumed. So you now know how to look at the last line of a file. You can also use tail to start looking at a file from any line number. For example

    % tail +15 /etc/termcap

will show the contents of the file /etc/termcap from line 15 onwards. You could also use the more command to achieve the same result

    % more +15 /etc/termcap

However there is a difference between these two commands.
When you use more, the ...

... calfile

Now calfile will contain the calendars for the current month as well as June 1994. Compare this with

    % cal 06 1994 > calfile

which leaves only the calendar for June 1994 in calfile. Thus the >> sign is safer to use because it never destroys any data, but this operation will keep adding to the file, and it can sometimes be difficult to make out what part of the output was produced by your last command and which portion is the outcome of previous redirections or was simply the original contents of the file.

Check Your Progress 7

1. What happens if you redirect the output of an operation on a file to the same file? For instance, find out the result of

    % cat catfile > catfile

3.7.2 Standard Input

Just as many commands produce output on the screen, some commands take input from the keyboard, although most take input from files. Look at an aspect of the cat command you have not studied so far

    % cat

The result of this is deafening silence. The uninitiated might wait several minutes before aborting the command, thinking there is something wrong because the system does not appear to be doing anything at all. The truth is that cat can take its input from the standard input as well as from a file. However the output is always produced on the standard output. If any filenames are specified they are used as the input, but if none is mentioned the input is taken from the standard input. There are also some commands which take input only from the standard input.

In the present case no filename has been specified and cat is waiting for input from the standard input, the keyboard here. So if you type something, cat writes it out to the standard output and the effect is that of echoing your input.

    A foolish consistency is the hobgoblin of little minds - Emerson
    A foolish consistency is the hobgoblin of little minds - Emerson

(Actually if you had given the command just as shown then the above is not strictly correct.
The cat command will buffer your input and when that buffer is full it will straightaway write it out onto the standard output. So you will probably find that you have to type several lines of text before you see it again on the screen. However if you say

    % cat -u

the result will be just as described, for the -u flag calls up cat in unbuffered mode.) If you want to put an end to your misery you can terminate your input file by saying ^D, thereby causing cat to finish and present you with your prompt.

To redirect standard input, say

    % cat < catfilesrc

whereupon cat will print the contents on the screen. This is just the same as

    % cat catfilesrc

because cat can take its input from a file as well. So to copy this file to catfiletarget, you can say

    % cat catfilesrc > catfiletarget

or

    % cat < catfilesrc > catfiletarget

Thus you can redirect both standard input and standard output in the same command. Some commands do not take input from the standard input. In such cases redirection of the input is not possible, as with the ls, cp, mv, rm or who commands.

Check Your Progress 8

1. How can you create text files with cat? Can this method be used to alter an existing file?

Remember that redirection is a facility provided by the shell, not by the command. The command being run does not know or care what its standard input or output are connected to, and it continues to use them. So the command has to be designed to take input from the standard input if redirection of input is to be possible. Thus you cannot say

    % cp < cpsrcfile

because cp does not take its input from the standard input. Similarly output redirection is not possible unless the command is designed to write to its standard output.

3.7.3 Standard Error

So far we have seen the effect of redirecting the output of some commands that completed successfully. Let us look at this a bit more closely.
For example, if there is no command like gah, say

    % gah > gahfile

If you do so you will find that you get a protest message from UNIX on the terminal but that gahfile is empty. Similarly

    % ls -l gah > lsfile

produces a message on the terminal but nothing in lsfile. Why does the redirection fail? After all, the command did produce output.

The reason is that there is a third standard file in UNIX, called the standard error. UNIX programs and utilities are usually designed to provide error messages in case there is something wrong and the program is not able to proceed as expected. Such messages are often referred to as diagnostic output because they can help the user diagnose the reason for failure. This kind of output is usually written to the standard error file. Usually the standard error is also connected to the terminal by default, but like the standard input or output, the standard error can also be redirected. To do this in the C-shell say

    % gah >& gahfile

This will place both the standard output and the standard error in gahfile. Here it will have only the error message telling you that gah could not be found. How do you place the standard output and standard error in different files? Well, this is easy to do in the Bourne or Korn shells, but in the C-shell the way to achieve this is somewhat convoluted. So we will not look at it right now. You are referred to unit 4 on shell programming for this.

3.7.4 Filters and Pipelines

A filter is a command which can take its input from the standard input and can produce output on the standard output. Having the capability to read from or write to files is not a disqualification.
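The definition of a filter can be checked directly by feeding a command through its standard input instead of naming a file. The following is only a minimal sketch, written for a Bourne-compatible shell (sh) rather than the C-shell used in the rest of this unit; the scratch directory, the file name sample.txt and its contents are all made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory (hypothetical names throughout).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'alpha\nbeta\ngamma\n' > sample.txt

# cat given a filename, and cat reading the same data on standard input.
from_file=$(cat sample.txt)
from_stdin=$(cat < sample.txt)

# grep also reads standard input and writes only the matching lines
# to standard output, so it behaves as a filter too.
filtered=$(grep 'eta' < sample.txt)

echo "$from_stdin"
echo "$filtered"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Both ways of running cat give the same text, while grep passes on only the line that matches.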
So ls is not a filter because it does not read from the standard input, but cat is one because it can do so (although it can read from a file as well) and also writes to the standard output.

You can think of a filter as a "device" placed between the standard input and the standard output which filters the standard input before placing it on the standard output. In the case of cat there is no filtering action at all, but a command like grep does perform some weeding action on its input.

The standard output of a command can serve as the standard input of another. Several commands can be chained together like this. Such an arrangement is called a pipeline. Pipelines are one of the big strengths of UNIX, because they often enable us to group several existing commands quickly to perform a task for which there is no command directly available.

A major design goal of UNIX was to have an operating system which allowed easy sharing of data and programs, and allowed people to build on the work of others instead of having to do things from scratch. The facility of pipelining helps meet this goal because you can piece together commands written by different people to achieve your objective rather than wasting your time on doing things which have already been done. Let us take a simple example.

Suppose you want to find out how many of the files in a directory are directories rather than ordinary files. It would have been wonderful if there were an option to ls which did this job, but since that is not the case we will have to try something else. One way is to look at the listing with ls -p and count lines which end in /. Such a visual method is tedious and prone to error, especially if there are many files in the directory. So let us try to make UNIX do this for us. How about the following?

    % ls -p > tmp
    % grep -c '/$' tmp

We first get the listing in a temporary file tmp and then count the number of occurrences of / at the end of a line in tmp using the grep command.
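The two-step method can be tried safely in a scratch directory. The sketch below assumes a Bourne-compatible shell (sh) instead of the C-shell; the directory and file names are invented for the example. Its last command obtains the same count by connecting ls directly to grep, in the manner of the pipeline arrangement described earlier in this section.

```shell
#!/bin/sh
# Build a scratch directory with two subdirectories and three ordinary
# files (all names are hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir docs src
touch notes.txt report.txt todo

# Step 1: save the listing; -p appends / to every directory name.
ls -p > tmp
# Step 2: count the lines of the listing that end in /.
dir_count=$(grep -c '/$' tmp)
rm tmp

# The same count with the output of ls fed straight into grep.
piped_count=$(ls -p | grep -c '/$')

echo "directories: $dir_count (two steps), $piped_count (pipeline)"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Note that tmp itself appears in the first listing, but as an ordinary file, so it does not disturb the count.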
The result will be available on the standard output. While this method will work it has a few disadvantages. One is that it is slow because an intermediate file has to be created. Secondly we cannot start the grep command before the ls finishes. Also if we run many commands like this we will be left with ...

    1159  07  0:02  ls
    1168  07  0:09  cc
    1276  07  0:01  ps

This means that you have three commands running at the moment. The csh is your login shell with process-id 1149 and started from terminal number 7. The command has used up 47 seconds of computer time. There are three other commands running, and ls and cc are probably in the background. It is easy to see that ps is always one of the commands running. Thus one knows how many background processes one has set off, although some of the processes ps shows might have been created by commands you ran and not by you explicitly.

This simple form of the ps command gives only the first word of your command but will show all your processes, including those you might have started from some other terminal.

ps has quite a few options but we will look at only a few more. The -l option gives more information on each process while the -f option gives the full command line including all arguments with which it was invoked. The -e option shows all running processes instead of just your own.

If you run ps and find that a process you started in the background is no longer listed, it means that the process has completed or that it aborted for some reason. Also remember that ps gives a snapshot of the state of processes at some time, and that by the time the output is displayed on the terminal matters might have changed.

3.8.2 Stopping Background Processes

You know that a foreground process can be terminated by pressing the DELETE key. But this will naturally not work for a process running in the background.
To stop such a process you need to know its pid, which you might have noted while invoking the command or can deduce by examining the listing produced by ps. Then just say

    % kill 1168

and that process will be stopped. If a process has created other child processes, you need to find out their process-ids using ps and then give them all as arguments to kill.

Processes can be killed by passing them various signals with numbers 1 to 15. If nothing is specified then 15 is the default signal number used by kill. So the command above is the same as saying

    % kill -15 1168

If you have many processes running and are desperate because you cannot find out the process-ids, you can kill them all except your login shell by saying

    % kill 0

Some commands are written such that they can catch signals and act in a predetermined manner on receiving them rather than executing the default action of terminating the process. The login shell is such a command. So

    % kill 1149

has no effect. Such processes can be killed with signal number 9, which cannot be caught or ignored. But

    % kill -9 1149

will kill your login shell itself and you will be back at the login prompt.

Check Your Progress 10

1. Can you kill the ps command?
2. Try killing somebody else's process.
3. Can you kill your own login shell from another terminal?

3.9 SUMMARY

In this unit we have started at the beginning and looked at many basic UNIX commands. However there are many useful commands we have not been able to examine. You must refer to the manual and learn these. You should now know enough about UNIX to conduct a session with ease. In the next unit we will examine various types of editors in slight detail.

3.10 MODEL ANSWERS

Check Your Progress 1

1. The contents of the directory, which is the files that are in it and perhaps some that have been in it, are seen on the screen.
2. No model answer.
3. The file gets typed slower or faster than usual.
At a setting of say 75 baud, the characters appear laboriously on the terminal.

Check Your Progress 2

1. The directory heavens has hidden files, that is, files that begin with the '.' character. Since it is not empty, the directory does not get removed.
2. No model answer.
3. No model answer.

Check Your Progress 3

1. No model answer.
2. You cannot execute the directory in either case and will get an error message.
3. No. This is because of the definition of search permission.
4. Yes, you only need write permission on the directory. It does not matter whether you can write to the files or not.

Check Your Progress 4

1. No model answer.
2. No model answer.
3. No model answer.
4. No model answer.
5. No model answer.

Check Your Progress 5

1. No model answer.
2. No model answer.

Check Your Progress 6

1. No model answer.
2. No model answer.

Check Your Progress 7

1. You lose catfile because it gets truncated to zero bytes. The shell tries to create a file called catfile and, because it already exists, simply truncates it. Thus, cat has an empty file to cat and does nothing.

Check Your Progress 8

1. A file catfile can be created from the terminal by saying

    % cat > catfile

Now type in the desired contents and terminate the process with an end of file character (^D) on a line by itself. You cannot edit or alter a file directly by this method. You can only enter the text of the file all over again. You can also make catfile identical to some other existing file herefile by saying

    % cat herefile > catfile

This is like saying

    % cp herefile catfile

Check Your Progress 9

1. No model answer.
2. Just say felstwo
3. No model answer.

Check Your Progress 10

1. This is not impossible theoretically but very difficult to achieve in practice. By the time you have issued the kill command, the ps would have terminated anyway.
2. You will get an error message saying

    Permission denied

You may kill only your own processes. Only the super user can kill anybody's processes.
3.
Yes, this can be done.

UNIT 4 TEXT MANIPULATION

Structure
4.0 Introduction
4.1 Objectives
4.2 Inspecting Files
    4.2.1 File Statistics
    4.2.2 Searching For Patterns
    4.2.3 Comparing Files
4.3 Operating On Files
    4.3.1 Printing Files
    4.3.2 Rearranging Files
    4.3.3 Sorting Files
    4.3.4 Splitting Files
    4.3.5 Translating Characters
4.4 Summary
4.5 Model Answers

4.0 INTRODUCTION

In the previous unit you have looked at some of the simple facilities available in UNIX. You have looked at some of the features of the system. In particular you have seen how to gain access to the system and how to sign off. Now it is in the period of time between these two activities that you will usually be doing any useful or interesting work. In this respect we have not gone beyond the basics. While most of what you have learnt in the previous unit is essential, it is not very useful by itself. You will not be very productive if all you can do on the computer is make and remove directories or type existing files or change the permission modes of your copies of system files.

You will now need to be doing something more useful than that. From its inception, because of the design goals of its makers, UNIX has always been rich in documentation, typesetting and text manipulation tools. In this unit we will take a look at some of the most important and common text manipulation utilities available in UNIX. We will not look at typesetting tools like nroff, troff, tbl or eqn because they are very complex and not likely to be available on your computer anyway. Here we will concentrate on the simpler text oriented utilities.

4.1 OBJECTIVES

In this unit we take a look at the main text processing utilities in UNIX, as well as learn about the editors available. However, we will not look at the rich document preparation facilities that UNIX has.
By the end of this unit, you should be able to:
* Use wc and the grep family of utilities
* Print text files in formatted fashion with pr
* Compare text and binary files with cmp, comm and diff
* Arrange text files with cut, paste and sort
* Split and translate text files using tr

4.2 INSPECTING FILES

Let us first look at commands that allow us to inspect files without altering them. For example, we might want to find out how many words there are in a file, or we might want to locate places in the file which contain a particular text expression. Before going further, we must be clear as to what a text file is. This is a file which contains only printable characters and which is organised around lines. Although in some cases we can alter the files, these commands are really meant to let us look at the files or to find out about them.

4.2.1 File Statistics

The wc command tells you the number of characters, words and lines in a text file.

    % wc quotation
    8 43 227 quotation

This means that quotation has 8 lines, 43 words and 227 characters. A word is a string of characters delimited by any combination of one or more spaces, tabs or newlines. If you wish you can make wc operate on the standard input, whereupon you will not find any filename displayed in the output.

    % wc
    No generalisation is ever wholly true, including this one.
    The problem with equality is that we desire it only with our superiors.
    ^D
    2 22 130

This also means you can use wc in a pipe, either to read from or write to. Thus

    % cat quotation | wc
    8 43 227

or

    % who | wc
    9 45 333

In both cases the method used is perhaps not the most natural one. For example, to find out the number of users in a system you could say

    % who -q

You can do a wc on several files at a time and then you get an additional line of output giving the total figures.
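The behaviour of wc on a pipe, on redirected input and on several files at once can be seen with a small experiment. This is a sketch for a Bourne-compatible shell (sh), not the C-shell of the examples above; the file names first.txt and second.txt and their contents are assumptions made for the illustration. Because the amount of padding in wc output differs between systems, the sketch strips or squeezes the spaces before storing the figures.

```shell
#!/bin/sh
# Scratch directory with two small text files (hypothetical names).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'one two three\nfour five\n' > first.txt
printf 'six\n' > second.txt

# wc at the end of a pipe: only the figure, no filename.
line_count=$(cat first.txt | wc -l | tr -d ' ')

# wc reading redirected standard input.
word_count=$(wc -w < first.txt | tr -d ' ')

# wc on several files: the last line of output carries the totals.
# LC_ALL=C keeps the label in English; tr and sed normalise the spacing.
totals=$(LC_ALL=C wc first.txt second.txt | tail -n 1 | tr -s ' ' | sed 's/^ //')

echo "lines in first.txt: $line_count"
echo "words in first.txt: $word_count"
echo "totals line:        $totals"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

The totals line lists lines, words and characters for both files taken together, followed by the word total.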
If you wish, you can find only the number of lines in the input by using the -l option, only the number of words by using -w and only the number of characters by saying -c. These options can be combined in any order. So

    % wc -cl quotation
    227 8 quotation

You can see that

    % wc -lwc

is the same as wc.

Check Your Progress 1

1. Find another way of counting the number of characters in a file.
2. Try wc on a file which is not a text file, or on a directory.

4.2.2 Searching for Patterns

We can now come to a few commands which help in locating patterns in files. One such program is grep (for global regular expression printer). It takes one regular expression which you want it to search for, and looks for it one by one in all of the specified files. Whenever grep finds a line in a file that contains the pattern, it prints the line on the standard output. If more than one file was given to grep to search, the line is preceded by the name of the file in which the match was found, followed by a colon. If only one file was to be searched then only the line is printed.

A word on regular expressions is in order. A regular expression is a way of specifying a template or pattern which can match several text strings according to certain rules. For specifying the template, some characters are used with a certain meaning. Such characters are called metacharacters. Thus a dot (.) matches any single character. We will not go into the details of the rules governing regular expressions here, because you must have learnt about them in your compiler design course. Regular expressions are used there to specify languages consisting of legal sentences from an alphabet. From such a specification you must have learnt how to construct a lexical analyser which accepts only valid sentences, that is, sentences of the language specified by the regular expression.
In the present context, our alphabet is the set of printable characters and the language is the set of all the text strings that match the regular expression. You should refer to your UNIX manual to find out the exact rules for constructing regular expressions for grep.

Since the C-shell itself attaches a special meaning to many of the metacharacters, you will need to tell the shell not to interpret the regular expression which you are trying to pass to grep. Single quotes are the safest way of telling the shell this. So the regular expression argument to grep should be enclosed in single quotes, although double quotes also work in many cases. We will examine this matter in the next unit on Shell Programming.

Unfortunately the meaning attached to metacharacters in different utilities of UNIX is not always consistent. For example, in grep, as we just saw, an arbitrary single character is matched by a period (.) while in the C-shell this is done by the question mark (?). This is a potential source of confusion, all the more so because a beginner can find it hard to construct or even interpret a regular expression anyway. However, with practice this difficulty reduces somewhat. Moreover, not all utilities support regular expressions in their fullest form, and the degree of support varies amongst them.

By now you will be complaining because you want to see some real examples, not endless commentary on the command. So here we go

    % grep Gupta payfile

tells you where the string "Gupta" occurs in the file payfile. As shown here, grep is matching a text string exactly. Every line in payfile that contains the given string anywhere will be printed. You can give more than one file as an argument.
    % grep Thomas custfile orderfile

If you want to know the line number in the file of the line on which the matches were found, say

    % grep -n Australia country

To count the number of lines which matched, just say

    % grep -c India prodfile

This will not print the lines; only the count will be shown. You can invert the sense of a match like this

    % grep -v India prodfile

This command will print lines that do not include the string India. Remember that grep looks for only one regular expression but can look at more than one file. So do not try

    % grep Ram Kumar users
    grep: can't open Kumar

to look for Ram Kumar in a file users. The command as shown will look for the string Ram in the two files called Kumar and users. Instead you should say

    % grep "Ram Kumar" users

whereupon Ram Kumar will be searched for in the file users. There is also an option to turn off case sensitivity. So

    % grep -i "Ram Kumar" users

will find any occurrence of Ram Kumar irrespective of case. Thus this would report RAM KuMar as a match. What if there could be occurrences of the string in the file with an unknown number of spaces between the two words? You will now need to use regular expressions.

    % grep "Ram *Kumar" users

matches Ram Kumar in this case. The * metacharacter specifies a closure, meaning that the preceding pattern is to be matched 0 or more times, which is what we want here.

Grep is line oriented and patterns are not matched across line boundaries. The metacharacters ^ and $ stand for the beginning and the end of a line respectively. So to look for an empty line, say

    % grep '^$' users

But if you are looking for blank lines, say

    % grep '^[ 	]*$' users

The [ ] here encloses one space and one tab: it is the character class consisting of spaces and tabs, and the * metacharacter is a closure which looks for 0 or more occurrences of these.

To see whether khanna is a valid login name, say

    % grep '^khanna' /etc/passwd

because the login name is the first field in the passwd file.
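The grep options and metacharacters described so far can be tried together in a small sketch like the following. The data file and its contents are hypothetical and are created by the sketch itself, and the syntax assumes a POSIX sh-style shell rather than the C-shell.

```shell
# Hypothetical data file, created here only so the sketch is self-contained.
# Line 4 is empty and line 5 holds only spaces.
users=$(mktemp)
printf 'Ram Kumar\nRAM KUMAR\nRam    Kumar\n\n   \nSita Devi\n' > "$users"

grep -n 'Sita' "$users"        # matching lines, each prefixed with its line number
grep -c 'Ram' "$users"         # only the count of matching lines
grep -v 'Ram' "$users"         # lines that do NOT contain Ram
grep -i 'ram kumar' "$users"   # match irrespective of case
grep 'Ram *Kumar' "$users"     # zero or more spaces between the two words
grep -c '^$' "$users"          # count of truly empty lines
grep -c '^[ ]*$' "$users"      # count of empty lines plus lines holding only spaces
```

Comparing the last two counts shows the difference between an empty line and a blank line: the class in brackets here holds only a space, so a line made up of tabs would need the tab added to the class as well.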
You can get every line in a file with line numbering by saying

    % grep -n '^' letter

This is like a cat on the file but with the line numbers displayed too. To find lines containing a number, say

    % grep '[0-9]' table

which matches any line containing a sequence of one or more digits.

We have seen that grep cannot search for more than one regular expression at a time. There is another utility called egrep which can handle regular expressions with alternations. We will not look at it here but you should study the manual entry for it.

There is another utility in this family called fgrep which does not handle regular expressions. Since it handles only fixed text strings, however, it is faster. Thus you can say

    % fgrep "Ram Kumar" empfile custfile*

Another advantage of this command is that you can store a list of words in a file, say search, one word per line. You can then look for the occurrence of any of those words in a file like this

    % fgrep -f search story

Usually grep is sufficient for everyday use but whenever needed you can make use of fgrep or egrep.

Check Your Progress 2

1. Find all lines in a file with words longer than 8 letters, assuming that words are separated by spaces except at the beginning or end of a line.
2. Find all lines which have both of two specified regular expressions. For example, you want to look for Ram as well as Kumar without regard to their relative positions in a line.
3. Find all lines with exactly 10 characters in them.
4. Find out from how many places a given user is logged in.
5. The fields of /etc/passwd are separated by colons, and the user-id is the third field. Find out how many super user accounts there are in the installation.

We will now look at a group of utilities which help us compare two files. While talking of cp in section 2.4.8, we did not know of commands which could help us ascertain whether the original and the copy indeed had the same contents. First let us make a copy of the passwd file in our
