
Introduction to COBOL

Introduction

Aims To provide a brief introduction to the programming language COBOL. To provide a
context in which its uses might be understood. To introduce the Metalanguage used
to describe syntactic elements of the language. To provide an introduction to the
major structures present in a COBOL program.

Objectives By the end of this unit you should -

1. Know what the acronym COBOL stands for.

2. Be aware of the significance of COBOL in the marketplace.

3. Understand some of the reasons for COBOL's success.


Basic Procedure Division Commands

Introduction

Aims The PROCEDURE DIVISION contains the code used to manipulate the data described in
the DATA DIVISION. This tutorial examines some of the basic COBOL commands used
in the PROCEDURE DIVISION.

In the course of this tutorial you will examine how assignment is done in COBOL,
how the date and time can be obtained from the computer and how arithmetic
statements are written.

Objectives By the end of this unit you should -

1. Know how to get data from the keyboard and write it to the screen.

2. Know how to get the system date and time.

3. Understand how the MOVE is used for assignment in COBOL.

4. Understand how alphanumeric and numeric moves work.


COBOL Selection Constructs

Introduction

Aims In most procedural languages, If and Case/Switch are the only selection constructs
supported. COBOL supports advanced versions of both of these constructs, but it
also introduces the concept of Condition Names - a kind of abstract condition.

In this tutorial we will examine COBOL's selection constructs, the IF and the
EVALUATE and we will demonstrate how to create and use Condition Names.

Objectives By the end of this unit you should -

1. Understand how an IF statement works.

2. Know the types of condition that COBOL supports and understand how
and when to use them.

3. Know the condition precedence rules and be able to create complex
conditions using AND and OR.

4. Know how to use Implied subjects.

5. Be able to create nested IF statements.

6. Understand how Condition Names work and be able to create and use
them.

7. Be able to use the SET verb to set a Condition Name to true.

8. Know how to use a Condition Name to signal the end of a Sequential File.

9. Understand how the EVALUATE works.

Prerequisites Introduction to COBOL

Declaring data in COBOL

Basic Procedure Division Commands

Selection using IF

Introduction When a program runs the program statements are executed one after another in
sequence unless a statement is encountered that alters the order of execution.

An IF statement is one of the statement types that can alter the order of execution
in the program.

An IF statement allows the programmer to specify that a block of code is to be
executed only if the condition attached to the IF statement is satisfied.

IF syntax

When an IF statement is encountered in a program, the StatementBlock following
the THEN is executed if the condition is true, and the StatementBlock following
the ELSE (if used) is executed if the condition is false.

The StatementBlock(s) can include any valid COBOL statement including further
IF constructs, PERFORMs, etc.
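
The behaviour described above can be sketched in a small program. This is only an illustrative example: the program name, the data-item StudentMark and its value are hypothetical, and the source is assumed to be compiled as free-format COBOL (for example with GnuCOBOL's `cobc -x -free`).

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. SimpleIf.
*> Hypothetical example; StudentMark is an illustrative name.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 StudentMark    PIC 999 VALUE 55.

PROCEDURE DIVISION.
Begin.
    *> The block after THEN runs when the condition is true,
    *> the block after ELSE runs when it is false.
    IF StudentMark >= 40 THEN
        DISPLAY "Pass"
    ELSE
        DISPLAY "Fail"
    END-IF
    STOP RUN.
```

With StudentMark set to 55 the condition is true, so the program displays Pass.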

Using the END-IF delimiter
Although the scope of the IF statement may be delimited by a full stop (the old
way), or by the END-IF (the new way), the END-IF delimiter should always be
used.

The END-IF makes explicit the scope of the statement. Using a full stop to delimit
the scope of the IF can lead to problems. For instance, the two IF statements
below are supposed to perform the same task. But the scope of the one on the left
is delimited by the END-IF, while that on the right is delimited by a full stop.

Statement1                     Statement1
Statement2                     Statement2
IF VarA > VarD THEN            IF VarA > VarD THEN
   Statement3                     Statement3
   Statement4                     Statement4
END-IF                            Statement5
Statement5                        Statement6.
Statement6.

Unfortunately, in the IF on the right, the programmer has forgotten to follow
Statement4 by a delimiting full stop. This means that Statement5 and Statement6
will be included in the scope of the IF (i.e. will only be executed if the
condition is true) by mistake. If you use full stops to delimit the scope of an IF
statement, this is an easy mistake to make and, once made, it is difficult to spot.
A full stop is small and unobtrusive compared to an END-IF.
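
The difference in scope can be seen by running a small program. This is a sketch only: the program name, the values given to VarA and VarD, and the message texts are assumptions chosen so that the condition is false, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. FullStopScope.
*> Hypothetical example; values chosen so that VarA > VarD is false.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 VarA    PIC 99 VALUE 1.
01 VarD    PIC 99 VALUE 5.

PROCEDURE DIVISION.
Begin.
    *> Scope closed by END-IF: the second DISPLAY always runs.
    IF VarA > VarD THEN
        DISPLAY "END-IF version: inside the IF"
    END-IF
    DISPLAY "END-IF version: after the IF".

    *> No END-IF: everything up to the full stop belongs to the IF,
    *> whatever the indentation suggests.
    IF VarA > VarD THEN
        DISPLAY "Full stop version: inside the IF"
        DISPLAY "Full stop version: meant to be after the IF".

    STOP RUN.
```

Because the condition is false, the only line displayed is "END-IF version: after the IF". The statement that was meant to follow the second IF is silently skipped.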

Condition Types The IF statement is not as simple as the syntax diagram above seems to suggest.
The condition following the IF is drawn from one of the condition types shown in
the table below.

Condition Types

Simple Conditions

o Relation Conditions

o Class Conditions

o Sign Conditions

Complex Conditions

Condition Name

Simple and Complex conditions are examined in this section, but Condition
Names are so important that they are covered separately in the next section.

Relation Conditions

The syntax of a Relation Condition is shown above. As you can see from the
diagram, a Relation Condition may be used to test whether a value is less than,
equal to, or greater than another value.

In the comparison we can use the full words or the symbols shown. Note,
however, that there is no symbol for NOT; you must use the word if you want to
express this condition.

When a condition is evaluated, it evaluates to either True or False. It does not
evaluate to 1 or 0.

Note that the values of the compared items must be type compatible. For instance,
it is not valid to have a statement that says

IF "mike" IS EQUAL TO 123 THEN etc

Class Conditions

Although COBOL data-items are not "strongly typed", they do fall into some
broad categories or classes, such as numeric or alphanumeric. A Class Condition
may be used to determine whether the value of a data-item is a member of one of
these classes. For instance, a NUMERIC Class Condition might be used on an
alphanumeric (PIC X) or a numeric (PIC 9) data-item to see if it contains numeric
data.

The UserDefinedClassName is a name that a programmer can assign to a set of
characters. The programmer must use the CLASS clause of the SPECIAL-NAMES
paragraph, of the CONFIGURATION SECTION, in the ENVIRONMENT DIVISION, to
assign a class name to a set of characters.

Note: Although this course will not cover the SPECIAL-NAMES paragraph in detail,
it is useful to acquaint yourself with its clauses. As well as setting up a class
name, the SPECIAL-NAMES paragraph allows you to do such things as:

o Specify the collating sequence (e.g. ASCII or EBCDIC)

o Specify the currency sign

o Create user-defined figurative constants

Rules
The target of a class test must be a data-item whose usage is, explicitly or
implicitly, DISPLAY. In the case of numeric tests, data-items with a usage of
PACKED-DECIMAL may also be tested.

The numeric test may not be used with data-items described as alphabetic (PIC A)
or with group items when any of the elementary items specifies a sign.

An alphabetic test may not be used with any data-items described as numeric
(PIC 9).

A data-item conforms to the UserDefinedClassName if its contents consist
entirely of the characters listed in the definition of the UserDefinedClassName.

Example

* Uses the UPPER-CASE Intrinsic Function to convert to uppercase
IF InputChar IS ALPHABETIC-LOWER
   MOVE FUNCTION UPPER-CASE (InputChar) TO InputChar
END-IF
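
The following sketch puts a NUMERIC test and a user-defined class test side by side. The class name HexDigit, the data-item InputText and its value are hypothetical names introduced for illustration, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ClassTests.
*> Hypothetical example; HexDigit and InputText are illustrative names.

ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
    *> The CLASS clause assigns a class name to a set of characters.
    CLASS HexDigit IS "0" THRU "9" "A" THRU "F".

DATA DIVISION.
WORKING-STORAGE SECTION.
01 InputText    PIC X(4) VALUE "12AF".

PROCEDURE DIVISION.
Begin.
    IF InputText IS NUMERIC
        DISPLAY "Numeric"
    ELSE
        DISPLAY "Not numeric"
    END-IF
    *> True only if every character is in the HexDigit set.
    IF InputText IS HexDigit
        DISPLAY "Hex digits only"
    END-IF
    STOP RUN.
```

With the value "12AF" the NUMERIC test fails (the item contains letters) but the HexDigit test succeeds, so the program displays "Not numeric" followed by "Hex digits only".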

Sign Condition

The Sign Condition determines whether or not the value of an arithmetic
expression is less than, greater than, or equal to zero. Sign Conditions are a
shorter way of writing certain Relation Conditions.
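
As a minimal sketch, the program below writes the same test first as a Sign Condition and then as the equivalent Relation Condition. The data-item AcctBalance and its value are hypothetical, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. SignTest.
*> Hypothetical example; AcctBalance is an illustrative name.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 AcctBalance    PIC S9(4) VALUE -25.

PROCEDURE DIVISION.
Begin.
    *> Sign Condition
    IF AcctBalance IS NEGATIVE
        DISPLAY "Overdrawn (sign condition)"
    END-IF
    *> Equivalent Relation Condition
    IF AcctBalance < 0
        DISPLAY "Overdrawn (relation condition)"
    END-IF
    STOP RUN.
```

Both messages are displayed, because the two conditions test the same thing.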

Complex Conditions

Complex Conditions are formed by combining two or more simple conditions
using the conjunction operators OR or AND.

Like other conditions in COBOL, a complex condition evaluates to either True or
False. A complex condition is an expression and, like arithmetic expressions, it is
evaluated from left to right unless the order of evaluation is changed by the
precedence rules (shown below) or by bracketing.

Precedence   Condition Value   Arithmetic Equivalent

1.           NOT               **
2.           AND               * or /
3.           OR                + or -

Example

IF Row > 0 AND Row < 26 THEN
   DISPLAY "On Screen"
END-IF

IF NOT Amnt1 < 10 OR Amnt2 = 50 AND Amnt3 > 150 THEN
   DISPLAY "Done"
END-IF

Note: You can see from the IF Amnt1 example that figuring out how a complex
condition will be evaluated is not straightforward. Always use brackets to make
the order of evaluation explicit.

The effect of bracketing

Let's examine the last example above to see what difference bracketing makes to
the order of evaluation.

First consider how the statement would be bracketed to make explicit what is
actually happening.

The NOT takes precedence so we must write -
   NOT (Amnt1 < 10)

The AND is next in precedence so we must bracket -
   ((Amnt2 = 50) AND (Amnt3 > 150))

Finally the OR is evaluated to give the full condition as -
   (NOT (Amnt1 < 10)) OR ((Amnt2 = 50) AND (Amnt3 > 150))

If all the simple conditions are true, will the Complex Condition be true? Let's
check.

Condition      (NOT (Amnt1 < 10)) OR ((Amnt2 = 50) AND (Amnt3 > 150))
Expressed as   (NOT (T)) OR ((T) AND (T))
Evaluates to   (F) OR (T)
Evaluates to   True

Consider the condition in the table below and ask the same question. Is the
Complex Condition true if all the Simple Conditions are true?

Condition      NOT ((Amnt1 < 10) OR (Amnt2 = 50)) AND (Amnt3 > 150)
Expressed as   NOT ((T) OR (T)) AND (T)
Evaluates to   NOT (T) AND (T)
Evaluates to   (F) AND (T)
Evaluates to   False

Now, consider the condition below and create a table similar to the ones above. Is
this Complex Condition true if all the Simple Conditions are true?

IF NOT (((Amnt1 < 10) OR (Amnt2 = 50)) AND (Amnt3 > 150)) THEN
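
One way to check a hand-worked table is to let the compiler evaluate the conditions. The sketch below does this for the three bracketings discussed in this section. The values given to Amnt1, Amnt2 and Amnt3 are assumptions chosen so that every simple condition is true, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. Bracketing.
*> Hypothetical example; values make every simple condition true.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 Amnt1    PIC 999 VALUE 5.
01 Amnt2    PIC 999 VALUE 50.
01 Amnt3    PIC 999 VALUE 200.

PROCEDURE DIVISION.
Begin.
    *> Bracketing that matches the default precedence rules.
    IF (NOT (Amnt1 < 10)) OR ((Amnt2 = 50) AND (Amnt3 > 150)) THEN
        DISPLAY "Condition 1 is True"
    ELSE
        DISPLAY "Condition 1 is False"
    END-IF

    *> NOT applied to the bracketed OR.
    IF NOT ((Amnt1 < 10) OR (Amnt2 = 50)) AND (Amnt3 > 150) THEN
        DISPLAY "Condition 2 is True"
    ELSE
        DISPLAY "Condition 2 is False"
    END-IF

    *> NOT applied to the whole condition (the exercise above).
    IF NOT (((Amnt1 < 10) OR (Amnt2 = 50)) AND (Amnt3 > 150)) THEN
        DISPLAY "Condition 3 is True"
    ELSE
        DISPLAY "Condition 3 is False"
    END-IF
    STOP RUN.
```

The first two results should agree with the tables above. Work out the third by hand before running the program.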

Implied Subjects Although COBOL is often verbose, it does occasionally provide constructs that
enable quite succinct statements to be written. Implied Subjects is one of these
constructs.

You can use Implied Subjects when you are making a number of comparisons
against a single data-item.

For instance, in the first example in Complex Conditions above, we check to see
if Row is greater than 0 and Row is less than 26. We could rewrite this statement
using Implied Subjects as-

IF Row > 0 AND < 26 THEN etc

the Implied Subject here is Row.

Examples
In these examples the full condition is shown first and is followed by the
condition using Implied Subjects.

IF TotalAmt > 10000 AND TotalAmt < 50000 THEN etc.
IF TotalAmt > 10000 AND < 50000 THEN etc.
* The Implied Subject is - TotalAmt

IF Grade = "A" OR Grade = "B1" OR Grade = "B2" OR Grade = "B3"
IF Grade = "A" OR "B1" OR "B2" OR "B3"
* The Implied Subject is - Grade =

IF VarA > VarB AND VarA > VarC AND VarA > VarD
   DISPLAY "VarA is the Greatest"
END-IF
IF VarA > VarB AND VarC AND VarD
   DISPLAY "VarA is the Greatest"
END-IF
* The Implied Subject is - VarA >
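
The abbreviated forms can be tried out in a complete program. This is a sketch only: the picture clauses and the values given to TotalAmt and Grade are assumptions made for illustration, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ImpliedSubjects.
*> Hypothetical example; pictures and values are illustrative.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 TotalAmt    PIC 9(5) VALUE 25000.
01 Grade       PIC XX   VALUE "B2".

PROCEDURE DIVISION.
Begin.
    *> Same as: TotalAmt > 10000 AND TotalAmt < 50000
    IF TotalAmt > 10000 AND < 50000
        DISPLAY "TotalAmt is in range"
    END-IF

    *> Same as: Grade = "A" OR Grade = "B1" OR Grade = "B2" OR Grade = "B3"
    IF Grade = "A" OR "B1" OR "B2" OR "B3"
        DISPLAY "Grade is A or B"
    END-IF
    STOP RUN.
```

With these values both conditions are true, so both messages are displayed.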

Nested Ifs COBOL allows nested IF statements.

For example:

IF ( VarA < 10 ) AND ( VarB NOT > VarC ) THEN
   IF VarG = 14 THEN
      DISPLAY "First"
   ELSE
      DISPLAY "Second"
   END-IF
ELSE
   DISPLAY "Third"
END-IF

The table below contains values for the variables used in the nested IFs
above. In each instance, see if you can figure out which of the messages will be
displayed.

VarA   VarB   VarC   VarG   DISPLAY

3      4      15     14
3      4      15     15
3      4      3      14
13     4      15     14
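
After working through the table by hand, the answers can be checked with a program such as the sketch below. It places the nested IF in a paragraph (here called ShowMessage, a hypothetical name) and performs it once for each row of the table. The picture clauses are assumptions and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. NestedIfCheck.
*> Hypothetical example; ShowMessage is an illustrative paragraph name.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 VarA    PIC 99.
01 VarB    PIC 99.
01 VarC    PIC 99.
01 VarG    PIC 99.

PROCEDURE DIVISION.
Begin.
    *> One PERFORM per row of the table.
    MOVE 3  TO VarA  MOVE 4 TO VarB  MOVE 15 TO VarC  MOVE 14 TO VarG
    PERFORM ShowMessage
    MOVE 3  TO VarA  MOVE 4 TO VarB  MOVE 15 TO VarC  MOVE 15 TO VarG
    PERFORM ShowMessage
    MOVE 3  TO VarA  MOVE 4 TO VarB  MOVE 3  TO VarC  MOVE 14 TO VarG
    PERFORM ShowMessage
    MOVE 13 TO VarA  MOVE 4 TO VarB  MOVE 15 TO VarC  MOVE 14 TO VarG
    PERFORM ShowMessage
    STOP RUN.

ShowMessage.
    IF ( VarA < 10 ) AND ( VarB NOT > VarC ) THEN
        IF VarG = 14 THEN
            DISPLAY "First"
        ELSE
            DISPLAY "Second"
        END-IF
    ELSE
        DISPLAY "Third"
    END-IF.
```

The program displays one message per row, in the order the rows appear in the table.
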
Condition Names

Introduction Wherever a condition can occur, such as in an IF statement or an EVALUATE or a
PERFORM..UNTIL, a Condition Name (Level 88) may be used.

Condition Names are defined in the DATA DIVISION using the special level
number 88. They are always associated with a data-item and are defined
immediately after the definition of the data-item.

A Condition Name is a name given to a specified subset of the values which its
associated data-item can hold.

Like a condition, a Condition Name evaluates to True or False.

Condition Name
Syntax

When used with Condition Names, the VALUE clause does not assign a value. It
merely identifies the value(s) in the data-item which make the Condition Name
True.

When identifying the condition values, you can specify a single value, a list of
values, a range of values, or any combination of these.
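
The sketch below shows Condition Names defined with a list of values, a range, and a combination of the two, and shows the SET verb making a Condition Name true. All of the data names and values are hypothetical. Ranges such as "a" THRU "z" follow the collating sequence in use; the comments assume ASCII. Free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ConditionNames.
*> Hypothetical example; all names and values are illustrative.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 InputChar           PIC X VALUE "e".
   88 Vowel            VALUE "a" "e" "i" "o" "u".
   88 LowerLetter      VALUE "a" THRU "z".
   88 LetterOrDigit    VALUE "a" THRU "z" "0" THRU "9".

01 FileFlag            PIC 9 VALUE 0.
   88 EndOfFile        VALUE 1.

PROCEDURE DIVISION.
Begin.
    *> Each Condition Name is true when InputChar holds one of its values.
    IF Vowel
        DISPLAY "Vowel"
    END-IF
    IF LowerLetter
        DISPLAY "Lower case letter"
    END-IF
    IF LetterOrDigit
        DISPLAY "Letter or digit"
    END-IF

    *> SET moves the condition value (1) into the associated data-item.
    SET EndOfFile TO TRUE
    DISPLAY "FileFlag is now " FileFlag
    STOP RUN.
```

With InputChar set to "e" all three conditions are true, and after the SET the program displays "FileFlag is now 1".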

To specify a list of values, simply list the values after the keyword VALUE. The
COBOL Iteration Constructs

Introduction

Aims In almost every programming job, there is some task that needs to be done over
and over again. For example: The job of processing a file of records is an iteration
of the task - get and process record. The job of getting the sum of a stream of
numbers is an iteration of the task - get and add number. These jobs are
accomplished using iteration constructs.

Other computer languages support a variety of looping constructs, including
Repeat, While, and For loops. Although COBOL has a set of looping constructs
that is just as rich as other languages - richer in some cases - it only has one
iteration verb. In COBOL, all iteration is handled by the PERFORM verb.

Iteration constructs and their COBOL equivalents

C            Modula-2   COBOL

do{}while    Repeat     Perform Until ..With Test After
while        While      Perform Until ..With Test Before
for          For        Perform ..Varying

This tutorial demonstrates how the PERFORM verb is used to create Repeat loops,
While loops and For loops. It will also demonstrate how the PERFORM is used to
transfer control to an open subroutine.

Objectives By the end of this unit you should -

1. Understand how the PERFORM can be used to transfer control to a block of
code contained in a paragraph or section.

2. Know how to use the PERFORM..THRU and the GO TO and understand
the restrictions placed on using them.

3. Understand the difference between in-line and out-of-line Performs.

4. Be able to use the PERFORM..TIMES.

5. Understand how the PERFORM..UNTIL works and be able to use it to
implement while or do/repeat loops.

6. Be able to use PERFORM..VARYING to implement counting iteration
such as that implemented in other languages by the for construct.

7. Understand the significance of the order of execution in the
PERFORM..VARYING flowchart.

Prerequisites Introduction to COBOL

Declaring data in COBOL

Basic Procedure Division Commands

Selection Constructs
PERFORM..Proc

Introduction If you have written programs in another language, you will probably have come
across the idea of a subroutine: a block of code that is executed when invoked by
name. What you may not have realized is that there are essentially two types of
subroutine:

Open Subroutines
and
Closed Subroutines

If the language you learned was C or Modula-2, you are probably familiar with
closed subroutines. If you learned BASIC, you may be familiar with open
subroutines.

Open subroutines
An open subroutine is a named block of code that control can fall into, or
through. An open subroutine usually has access to all the data-items declared in
the main program and it can't declare any data-items of its own.

Although an open subroutine is normally executed by invoking it by name, it is
also possible, unless the programmer is careful, to fall into it from the main
program. In BASIC, the GOSUB command allows programmers to implement
open subroutines.

Closed subroutines
A closed subroutine is a named block of code that can only be executed by
invoking it by name. Usually a closed subroutine can declare its own local data
which cannot be accessed outside the subroutine. In a closed subroutine, data is
usually passed between the main program and the subroutine by means of
parameters passed to the subroutine when it is invoked.

In C and Modula-2, Procedures and Functions implement closed subroutines.

COBOL subroutines
COBOL supports both open and closed subroutines. Open subroutines are
implemented using the first format of the PERFORM verb. Closed subroutines are
implemented using the CALL verb and contained or external subprograms.
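
The contrast between the two kinds of subroutine can be sketched in one source file. In this illustration the paragraph AddOne is an open subroutine that works directly on the main program's data, while AddTen is a contained subprogram that receives its data as a parameter. All names and values are hypothetical, and the sketch assumes a compiler that supports contained (nested) programs and free-format source.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. OpenVsClosed.
*> Hypothetical example; all names are illustrative.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 RunningCount    PIC 999 VALUE 5.

PROCEDURE DIVISION.
Begin.
    *> Open subroutine: invoked by name with PERFORM.
    PERFORM AddOne
    *> Closed subroutine: invoked with CALL, data passed as a parameter.
    CALL "AddTen" USING RunningCount
    DISPLAY "RunningCount = " RunningCount
    STOP RUN.

AddOne.
    *> Uses the main program's data-item directly.
    ADD 1 TO RunningCount.

IDENTIFICATION DIVISION.
PROGRAM-ID. AddTen.
DATA DIVISION.
LINKAGE SECTION.
01 CountParam    PIC 999.

PROCEDURE DIVISION USING CountParam.
Begin.
    *> Can only see what was passed to it.
    ADD 10 TO CountParam
    EXIT PROGRAM.
END PROGRAM AddTen.
END PROGRAM OpenVsClosed.
```

Starting from 5, the PERFORM adds 1 and the CALL adds 10, so the program displays "RunningCount = 016".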

PERFORM format 1 syntax

Unless it is otherwise instructed, a computer running a COBOL program
processes the statements of the program in sequence, starting at the top of the
program and working its way down until the STOP RUN is reached. The PERFORM
verb is one way of altering the sequential flow of control in a COBOL program.
The PERFORM verb can be used for two major purposes:

1. To transfer control to a designated block of code.

2. To execute a block of code iteratively.

While the other formats of the PERFORM verb implement various types of
iteration, the format shown here is used to transfer control to an out-of-line block
of code.

The block of code may be one or more paragraphs, or one or more sections.

Note: A paragraph is a block of code that starts at the paragraph name and
extends to the next paragraph name, the next section name or the end of the
program text. A section is a block of code that starts at the section name and
extends to the next section name or the end of the program text. A section must
contain at least one paragraph.

How this format works
This format of the PERFORM verb transfers control to an out-of-line block of code.
When the end of the block is reached, control reverts to the statement (not the
sentence) immediately following the PERFORM.

1stProc and EndProc are the names of paragraphs or sections.

When the PERFORM..THRU is used, the paragraphs or sections from 1stProc to
EndProc are treated as a single block of code. COBOL programmers typically use
this format of the PERFORM to divide a program into open subroutines.

These subroutines are not as robust as the user-defined Procedures or Functions
found in other languages, but when COBOL programmers require that kind of
partitioning, they use contained or external subprograms.

Open subroutines are useful because they allow a programmer to code a
subroutine without the formality or overhead involved in coding a Procedure or
Function.

Note: The PERFORM..THRU should be used sparingly and then only to PERFORM a
paragraph and its immediately succeeding paragraph. The problem with using the
PERFORM..THRU to execute a number of paragraphs as one unit is that, in the
maintenance phase of your program's life, another programmer may insert a
paragraph in the middle of your PERFORM..THRU block. Suddenly your block won't
work correctly because now it's executing an additional, unintentional
paragraph.

PERFORM general notes

PERFORMs may be nested.
That is, a PERFORM may execute a paragraph that contains a PERFORM which in
turn may execute a paragraph that contains another PERFORM. As control reaches
the end of each paragraph it returns to the statement following the PERFORM which
caused the paragraph to be executed.

Order of execution independent of physical placement
The order of execution of the paragraphs is independent of their physical
placement. So it doesn't matter where in the Procedure Division we put our
paragraphs; the PERFORM will find and execute them.

Recursion not allowed.
Although Performs can be nested, neither direct nor indirect recursion is allowed.
This means that a paragraph must not contain a PERFORM that invokes itself or
any ancestor paragraph (parent, grandparent etc). Unfortunately this restriction is
not enforced by the compiler, but your program will not work correctly if you use
recursive Performs.

Note: Although Netexpress does allow recursive Performs, this is a nonstandard
extension. Please do not take advantage of it.

Why use this format? This format of the PERFORM verb is used to make programs more readable and
maintainable.

When we can identify a block of code in the program that performs some specific
task (e.g. Prints the report headings) this format allows us to replace the details of
how the task is being accomplished with a name that indicates what is being done
(e.g. PERFORM PrintReportHeadings).

We should use this format of the PERFORM to divide our programs into a
hierarchy of tasks and sub-tasks.

PERFORM..PROC example

$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID.  PerformFormat1.
AUTHOR.  Michael Coughlan.
* An example program using the Perform verb.

PROCEDURE DIVISION.
TopLevel.
    DISPLAY "In TopLevel. Starting to run program"
    PERFORM OneLevelDown
    DISPLAY "Back in TopLevel.".
    STOP RUN.

TwoLevelsDown.
    DISPLAY ">>>>>>>> Now in TwoLevelsDown."
    PERFORM ThreeLevelsDown.
    DISPLAY ">>>>>>>> Back in TwoLevelsDown.".

OneLevelDown.
    DISPLAY ">>>> Now in OneLevelDown"
    PERFORM TwoLevelsDown
    DISPLAY ">>>> Back in OneLevelDown".

ThreeLevelsDown.
    DISPLAY ">>>>>>>>>>>> Now in ThreeLevelsDown".

Self Assessment Questions
Now that you understand how this version of the PERFORM works, test your
understanding by referring to the example program above and answering the
following questions.

Q1 Write out what the example program above will display on the screen.

Q2 Taking your pen in hand once more, write out what the program will display
if the STOP RUN is missing.

Q3 Is it valid to insert the statement PERFORM ThreeLevelsDown into the
paragraph ThreeLevelsDown?

Q4 Is it valid to have the statement PERFORM TwoLevelsDown in the paragraph
ThreeLevelsDown?

Using the PERFORM..THRU

Introduction Although the PERFORM..THRU has dangers, as outlined above, it can
be a useful construct for dealing with errors. Sometimes we need to
stop executing a paragraph if an error is detected. The
PERFORM..THRU provides a mechanism which allows us to do this.

Coping with errors In the program fragment below, the programmer does not want to
execute the remaining statements in the paragraph if an error is
detected. The solution he has adopted, based on nested IF statements,
is somewhat cumbersome.

PROCEDURE DIVISION.
Begin.
    PERFORM SumSales
    STOP RUN.

SumSales.
    Statements
    Statements
    IF NoErrorFound
       Statements
       Statements
       IF NoErrorFound
          Statements
          Statements
          Statements
          IF NoErrorFound
             Statements
             Statements
             Statements
             Statements
          END-IF
       END-IF
    END-IF.

Using the PERFORM..THRU
In the program fragment below, the PERFORM..THRU is used to deal
with detected errors in a more elegant manner.

When the statement PERFORM SumSales THRU SumSalesExit is
executed, both paragraphs will be performed as if they were one
paragraph. The GO TO jumps to the exit paragraph which, because
the paragraphs are treated as one, is the end of the block of code.
This technique allows the programmer to skip over the code he does
not want executed if an error is detected.

The EXIT statement in the SumSalesExit paragraph is a dummy
statement. It has absolutely no effect on the flow of control. It is in
the paragraph merely to conform to the rule that every paragraph
must contain at least one sentence and in fact it must be the only
sentence in the paragraph. It may be regarded as a comment.

Note: This approach will soon be unnecessary, because the coming COBOL
standard solves the problem by having an EXIT PARAGRAPH statement that
can be used to exit a paragraph prematurely.

PROCEDURE DIVISION.
Begin.
    PERFORM SumSales THRU SumSalesExit
    STOP RUN.

SumSales.
    Statements
    Statements
    IF ErrorFound GO TO SumSalesExit
    END-IF
    Statements
    Statements
    IF ErrorFound GO TO SumSalesExit
    END-IF
    Statements
    Statements
    Statements
    IF ErrorFound GO TO SumSalesExit
    END-IF
    Statements
    Statements
    Statements
    Statements.

SumSalesExit.
    EXIT.
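
For comparison, here is a minimal sketch of the same early-exit idea written with the EXIT PARAGRAPH statement instead of PERFORM..THRU and GO TO. It assumes a compiler that supports EXIT PARAGRAPH (it is part of the COBOL 2002 standard) and free-format source. The flag ErrorFlag, the DISPLAY messages and the simulated error are hypothetical.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ExitParagraphDemo.
*> Hypothetical example; assumes EXIT PARAGRAPH is supported.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 ErrorFlag          PIC 9 VALUE 0.
   88 ErrorFound      VALUE 1.
   88 NoErrorFound    VALUE 0.

PROCEDURE DIVISION.
Begin.
    PERFORM SumSales
    DISPLAY "Back in Begin"
    STOP RUN.

SumSales.
    DISPLAY "Step 1"
    *> Simulate an error being detected.
    SET ErrorFound TO TRUE
    IF ErrorFound
        *> Leave the paragraph; control returns after the PERFORM.
        EXIT PARAGRAPH
    END-IF
    DISPLAY "Step 2 (skipped when an error is found)".
```

The program displays "Step 1" and then "Back in Begin". No exit paragraph, GO TO or PERFORM..THRU is needed.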

PERFORM..THRU restrictions
The PERFORM..THRU and GO TO are dangerous constructs which, if
used unwisely, will make your programs very difficult to read,
understand and maintain. Because of this, the PERFORM..THRU
should only be used to set up a paragraph exit as in the example
above and it should only cover two paragraphs. No other use of the
PERFORM..THRU is acceptable.

This is also the only time the GO TO should be used.

PERFORM..TIMES

Introduction The PERFORM..TIMES format has no real equivalent in most
programming languages. This format allows a block of code to be
executed a specified number of times.

PERFORM..TIMES Syntax

This format of the PERFORM executes a block of code
RepeatCount number of times before returning control to the
statement following the PERFORM.

Like the remaining formats of the PERFORM, this format allows two
types of execution:

o Out-of-line execution of a block of code

o In-line execution of a block of code

In-line vs Out-of-line

In-line execution
In-line execution will be familiar to programmers who have used the
iteration constructs (while, do/repeat, for) of most other
programming languages. In an in-line PERFORM, the block of code to
be iteratively executed is contained within the same paragraph as the
PERFORM. That is, the loop body is in-line with the rest of the
paragraph code.

The block of code to be executed starts at the keyword PERFORM
and ends at the keyword END-PERFORM (see example program
below).

Out-of-line execution
In an out-of-line PERFORM the loop body is a separate paragraph
or section. It is the equivalent of having a Procedure or Function call
inside the loop body of a while or for construct.

Some guidelines
In general, where a loop is needed but only a few statements are
involved, an in-line PERFORM should be used.

Where out-of-line code is executed by a format 1 PERFORM, the
code should perform some specific function and that function should
be identified by the paragraph name chosen.

Where an out-of-line paragraph consists of five statements or fewer,
there should be a good reason for placing these statements in a
separate paragraph.

Programmers should try to achieve a balance between in-line and
out-of-line code. The program should not be too fragmented, nor too
monolithic.

Example

$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID.  InLineVsOutOfLine.
AUTHOR.  Michael Coughlan.
* An example program demonstrating
* in-line and out-of-line Performs.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 NumOfTimes    PIC 9 VALUE 5.

PROCEDURE DIVISION.
Begin.
    DISPLAY "Starting to run program"
    PERFORM 3 TIMES
        DISPLAY ">>>>This is an in line Perform"
    END-PERFORM
    DISPLAY "Finished in line Perform"
    PERFORM OutOfLineEG NumOfTimes TIMES
    DISPLAY "Back in Begin. About to Stop".
    STOP RUN.

OutOfLineEG.
    DISPLAY ">>>> This is an out of line Perform".

Self Assessment Questions Examine the program above and write out what it will display on the
screen.

PERFORM..UNTIL

Introduction This format of the PERFORM is used where the while and do/repeat constructs are
used in other languages. It causes a block of code to be iteratively executed until
some terminating condition is reached.
PERFORM..UNTIL syntax

Notes
If the WITH TEST BEFORE phrase is used, the PERFORM behaves like a while loop
and the condition is tested before the loop body is entered.

The WITH TEST AFTER phrase causes the PERFORM to act like a do/repeat loop and
the condition is tested after the loop body has been executed.

The WITH TEST BEFORE phrase is the default and so is rarely explicitly stated.

How the PERFORM..UNTIL works
The flowcharts below show how the PERFORM..UNTIL works.

The terminating condition is only checked at the beginning of each iteration
(PERFORM WITH TEST BEFORE) or at the end of each iteration (PERFORM WITH
TEST AFTER). If the terminating condition is reached in the middle of the
iteration, the rest of the loop body will still be executed; although the terminating
condition has been reached, it cannot be checked until the current iteration has
finished.

Although the PERFORM WITH TEST BEFORE is often said to be equivalent to a
while loop, this is not entirely true. In a while loop, the condition is tested to see
whether the iteration should continue (for example, while(Letter != 's') ) but in a
PERFORM, the condition is tested to see if the iteration should stop (for example,
PERFORM WITH TEST BEFORE UNTIL Letter = "s").
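
The difference between the two phrases shows up most clearly when the terminating condition is already true on entry. The sketch below illustrates this; the data-item LoopCount, its value and the messages are hypothetical, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. TestBeforeAfter.
*> Hypothetical example; the terminating condition is true from the start.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 LoopCount    PIC 9 VALUE 5.

PROCEDURE DIVISION.
Begin.
    *> Tested before the body: the body is never executed.
    PERFORM WITH TEST BEFORE UNTIL LoopCount >= 5
        DISPLAY "Test before: in loop body"
    END-PERFORM

    *> Tested after the body: the body is executed once.
    PERFORM WITH TEST AFTER UNTIL LoopCount >= 5
        DISPLAY "Test after: in loop body"
    END-PERFORM
    STOP RUN.
```

Only "Test after: in loop body" is displayed, and it is displayed once.
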
TEST BEFORE Vs TEST AFTER
Beginning programmers often ask when they should use the WITH TEST BEFORE
loop, and when they should use the WITH TEST AFTER.

There really isn't a cookbook answer to this. It's a matter of experience. But we
can identify some circumstances when it is better to use the WITH TEST BEFORE
than the WITH TEST AFTER.

When you need to process a stream of data items, and don't know the size of the
stream, and can't detect the end of the stream until you attempt to retrieve the
next item, then a test before loop is the best construct to use.

If the end of the stream can be detected when the last item is retrieved, then the
appropriate construct is probably a test after loop.

The "read ahead" technique
Processing a stream of data items of undetermined length is a common operation
in COBOL, because sequential files fall into this category. A useful strategy
known as the read ahead has been developed for processing sequential files.

The central idea of the "read ahead" is that, because the end of the file cannot be
detected until an attempt is made to read a record, the Read must be positioned as
the last statement in the record processing loop.

You can see how this works in the processing template below. With the read
ahead strategy we always try to stay one data item ahead of the processing. So
the Read outside the loop reads the first record and this record is processed
inside the loop. The Read inside the loop reads the next, and all the succeeding
records. When the inside Read detects the end of file, it sets a Condition Name
that immediately causes the loop to halt.

Processing Template

READ StudentRecords
   AT END SET EndOfStudentFile TO TRUE
END-READ
PERFORM WITH TEST BEFORE UNTIL EndOfStudentFile
   record processing
   record processing
   record processing
   READ StudentRecords
      AT END SET EndOfStudentFile TO TRUE
   END-READ
END-PERFORM

This approach to processing a sequential file has two main advantages.

1. Because the read outside the loop reads the first record, the loop is never
entered if the file is empty.

2. Because the Read is the last statement in the loop, the loop can be halted
as soon as the end of file is detected.
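
The template can be turned into a complete program along the following lines. This is a sketch under several assumptions: the file name students.dat, the record layout, the flag FileFlag and the sample names are all hypothetical; LINE SEQUENTIAL organization is a common compiler extension rather than standard COBOL; and free-format source is assumed. So that it can be run without any other files, the program first writes a small test file and then reads it back using the read ahead.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ReadAhead.
*> Hypothetical example; file name and record layout are illustrative.

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    *> LINE SEQUENTIAL is a compiler extension (assumed available).
    SELECT StudentRecords ASSIGN TO "students.dat"
        ORGANIZATION IS LINE SEQUENTIAL.

DATA DIVISION.
FILE SECTION.
FD StudentRecords.
01 StudentDetails.
   02 StudentName    PIC X(10).

WORKING-STORAGE SECTION.
01 FileFlag               PIC X VALUE "N".
   88 EndOfStudentFile    VALUE "Y".
01 RecordCount            PIC 99 VALUE ZEROS.

PROCEDURE DIVISION.
Begin.
    *> Create a two-record test file so the sketch is self-contained.
    OPEN OUTPUT StudentRecords
    MOVE "Alice" TO StudentName
    WRITE StudentDetails
    MOVE "Brendan" TO StudentName
    WRITE StudentDetails
    CLOSE StudentRecords

    OPEN INPUT StudentRecords
    *> The read ahead: get the first record before entering the loop.
    READ StudentRecords
        AT END SET EndOfStudentFile TO TRUE
    END-READ
    PERFORM WITH TEST BEFORE UNTIL EndOfStudentFile
        ADD 1 TO RecordCount
        DISPLAY "Processing " StudentName
        *> The Read is the last statement in the loop body.
        READ StudentRecords
            AT END SET EndOfStudentFile TO TRUE
        END-READ
    END-PERFORM
    CLOSE StudentRecords
    DISPLAY "Records processed: " RecordCount
    STOP RUN.
```

Each record is processed once, and the loop halts as soon as the Read at the bottom of the loop body reports the end of the file.
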
PERFORM..UNTIL comment
The primary concern of a programmer who creates a loop should be - will the
loop terminate? Much of the work of proofs of program correctness goes into
proving that the loops in a program are going to terminate. It seems curious then,
that in most programming languages the loop condition concentrates on, not
whether the loop will end, but on whether the loop will keep going. COBOL is
one of the few languages that gets this right. In COBOL, the loop body is
executed until the terminating condition is reached.

PERFORM..VARYING

Introduction The PERFORM..VARYING is used to implement counting iteration. It is similar to
the For construct in languages like Modula-2, Pascal and C. However, these
languages permit only one counting variable per loop instruction, while COBOL
allows up to three.

Why three? Earlier versions of COBOL only allowed tables with a maximum of
three dimensions, and the PERFORM..VARYING was a mechanism for processing
them.

PERFORM..VARYING syntax

Notes
The AFTER phrase cannot be used in an in-line PERFORM. This means that only
one counter may be used with an in-line PERFORM.

The item after the VARYING phrase is the most significant counter, the counter
following the first AFTER phrase is the next most significant, and the last counter
is the least significant.

The least significant counter must go through all its values and reach its
terminating condition before the next most significant counter can be
incremented.

The item after the FROM, is the starting value of the counter.

The item after the BY, is the step value of the counter. This can be negative or
positive. If a negative step value is used, the counter should be signed (PIC S99,
etc.).

When the iteration ends, the counters retain their terminating values.

As before, when no WITH TEST phrase is used, the WITH TEST BEFORE is
assumed.

Though the condition would normally involve some evaluation of the counter, it
is not mandatory. For instance, the statement that follows is perfectly valid:

PERFORM CountRecords
VARYING RecCount FROM 1 BY 1 UNTIL EndOfFile
PERFORM..VARYING using one counter - Example
The example animation below demonstrates how a simple PERFORM..VARYING,
using only one counter, works. Pay particular attention to when the counter is
incremented. In the example, note that the condition Idx1 = 3 results in only two
passes through the loop body.
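
The two-pass behaviour can be checked with a short program. The sketch below is not the animation's own code: the program name and the PassCount item are made up for illustration, and the first line is the Micro Focus directive used elsewhere in this course (other compilers select free format in their own way).

```cobol
$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID. OneCounterDemo.
*> Hypothetical demo: counts the passes made by a one-counter loop.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 Idx1        PIC 9 VALUE ZERO.
01 PassCount   PIC 9 VALUE ZERO.

PROCEDURE DIVISION.
Begin.
    *> No WITH TEST phrase, so WITH TEST BEFORE is assumed: the
    *> condition is checked before every pass, including the first.
    PERFORM VARYING Idx1 FROM 1 BY 1 UNTIL Idx1 = 3
        ADD 1 TO PassCount
        DISPLAY "In loop body, Idx1 = " Idx1
    END-PERFORM
    DISPLAY "Passes made         = " PassCount
    DISPLAY "Idx1 after the loop = " Idx1
    STOP RUN.
```

The loop body runs for Idx1 = 1 and Idx1 = 2 only. The counter is incremented to 3 after the second pass, the condition is then found to be true, and Idx1 is left holding its terminating value of 3.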
PERFORM..VARYING using two counters - Example
This example animation demonstrates how a PERFORM..VARYING, with two
counters, works.

Note how the counter Idx2 must go through all its values and reach its
terminating value before the Idx1 counter is incremented. An easy way to think
about this is to think of it as a mileage counter. In a mileage counter, the units
counter must go through all its values 0-9 before the tens counter is incremented,
and the tens counter must go through all its values before the hundreds counter is
incremented.

Note that the first counter mentioned in the PERFORM is the most significant
and the next is the next most significant etc.

PERFORM..VARYING Example Program
The example program simulates the mileage counter mentioned above.
Examine this program and then attempt to answer the Self Assessment Questions
which follow.

$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID. MileageCounter.
AUTHOR. Michael Coughlan.
* Simulates a mileage counter

DATA DIVISION.
WORKING-STORAGE SECTION.
01 Counters.
   02 HundredsCnt PIC 99 VALUE ZEROS.
   02 TensCnt     PIC 99 VALUE ZEROS.
   02 UnitsCnt    PIC 99 VALUE ZEROS.

01 DisplayItems.
   02 PrnHunds    PIC 9.
   02 PrnTens     PIC 9.
   02 PrnUnits    PIC 9.

PROCEDURE DIVISION.
Begin.
    DISPLAY "Using an out-of-line Perform".
    DISPLAY "About to start mileage counter simulation".
    PERFORM CountMileage
        VARYING HundredsCnt FROM 0 BY 1 UNTIL HundredsCnt > 9
        AFTER TensCnt FROM 0 BY 1 UNTIL TensCnt > 9
        AFTER UnitsCnt FROM 0 BY 1 UNTIL UnitsCnt > 9
    DISPLAY "End of mileage counter simulation."

    DISPLAY "Now using in-line Performs"
    DISPLAY "About to start mileage counter simulation".
    PERFORM VARYING HundredsCnt FROM 0 BY 1 UNTIL HundredsCnt > 9
        PERFORM VARYING TensCnt FROM 0 BY 1 UNTIL TensCnt > 9
            PERFORM VARYING UnitsCnt FROM 0 BY 1 UNTIL UnitsCnt > 9
                MOVE HundredsCnt TO PrnHunds
                MOVE TensCnt TO PrnTens
                MOVE UnitsCnt TO PrnUnits
                DISPLAY PrnHunds "-" PrnTens "-" PrnUnits
            END-PERFORM
        END-PERFORM
    END-PERFORM.
    DISPLAY "End of mileage counter simulation."
    STOP RUN.

CountMileage.
    MOVE HundredsCnt TO PrnHunds
    MOVE TensCnt TO PrnTens
    MOVE UnitsCnt TO PrnUnits
    DISPLAY PrnHunds "-" PrnTens "-" PrnUnits.

Q1 Why is > 9 used in all the terminating conditions?

Q2 Why are the counters all declared as PIC 99? Surely the size of each counter
should only be one digit?

Q3 In the in-line PERFORM, why are there three separate Performs?
Introduction to Sequential Files

Introduction

Aims Files are repositories of data that reside on backing storage (hard disk, magnetic
tape or CD-ROM). Nowadays, files are used to store a variety of different types of
information, such as programs, documents, spreadsheets, videos, sounds, pictures
and record-based data.

Although COBOL can be used to process these other kinds of data file, it is
generally used only to process record-based files.

In this, and subsequent file-oriented tutorials, we examine how COBOL may be
used to process record-based files.

There are essentially two types of record-based file organization:

1. Serial Files (COBOL calls these Sequential Files)

2. Direct Access Files.

In a Serial File, the records are organized and accessed serially.

In a Direct Access File, the records are organized in a manner that allows direct
access to a particular record without having to read any of the preceding records.

In this tutorial, you will discover how COBOL may be used to process serial files.
Processing Sequential Files

Introduction

Aims The records in a Sequential file are organized serially, one after another, but the
records in the file may be ordered or unordered. The serial organization of the file
and whether the file is ordered or unordered have a significant bearing on how we
process the records in the file and what kind of processing we can do.

This tutorial examines the consequences of ordering and organization and reviews
some techniques for processing Sequential files.

Objectives By the end of this unit you should -

1. Understand how Sequential files are organized.

2. Understand the processing limitations imposed by an unordered
Sequential file.

3. Be able to add, delete and update records in an ordered Sequential file.

Prerequisites Introduction to COBOL

Declaring data in COBOL

Basic Procedure Division Commands


Q2 Why do you think the complex condition EndTF AND EndMF has been used to
terminate the loop?

Q3 There does not seem to be any special code to write out the remaining records
when one file ends before the other. Can the code above be correct?
Deleting records from an ordered Sequential file.
The animation below demonstrates how to delete records from an ordered
Sequential file.

Updating records in an ordered Sequential file.
Updating a record requires a change to one or more of the fields in the record.
Updates to Sequential files are usually collected together and applied to the file
in one go - a batch. To update the records in an ordered Sequential file from an
ordered batch of transaction records we have to create a new file that contains
the updated records.

The animation below demonstrates how to batch-update records in an ordered
Sequential file.

Multiple record type transaction files
In all the examples in this tutorial the transaction file has contained records of
only one type. For instance, the transaction file contained either a batch of
deletions, or a batch of insertions, or a batch of updates. In real life, though, all
these different kinds of transaction record would be gathered together into one
transaction file.

This raises some interesting problems. For one thing, the record sizes will be
different. A deletion record only needs the key field, while an insertion requires
the whole record, and an update may be somewhere between these two depending
on how many fields we are updating in one go. There may even be different
kinds of update record.

In future tutorials we will see how we can define and process files that contain
multiple record types.
Edited Pictures

Introduction

Aims In a business-programming environment, the ability to print reports is an
important property for a programming language. COBOL allows programmers to
write to the printer, either directly or through an intermediate print file.

But there would be little point in being able to write to the printer, if the output
could not be formatted properly. COBOL allows sophisticated formatting of
output through its Edited Picture clauses.

This tutorial introduces the additional symbols required for edited pictures and
shows how they may be used to format data for output to screen or printer.

Objectives By the end of this unit you should -

1. Know what an Edited Picture is.

2. Know and be able to use the different kinds of Edited Picture.


PIC 9(6) 000078 PIC 9(3),9(3)
PIC 9(6) 000078 PIC ZZZ,ZZZ
PIC 9(6) 000178 PIC ***,***
PIC 9(6) 002178 PIC ***,***
PIC 9(6) 120183 PIC 99B99B99
PIC 9(6) 120183 PIC 99/99/99
PIC 9(6) 031245 PIC 990099

Special Insertion The decimal point is the only Special Insertion symbol. A
decimal point is inserted in the character position where
the symbol occurs.

Notes
When a numeric data-item is moved into an edited data-
item containing the decimal point symbol, alignment
occurs along the position of the decimal point symbol,
with zero-filling and truncation as necessary.

There may be only one decimal point in each edited picture clause.

The decimal point symbol cannot be mixed with either the V (assumed decimal
point) or the P (scaling position) symbol.

Special Insertion examples
In the examples/questions below, see if you can figure out what result will be
produced when the value in the Sending item is moved to the edited picture in the
Receiving item. The description of the Sending item is shown in the Picture
column and its current value is shown in the Data column.

Sending item              Receiving item
Picture       Data        Picture       Result
PIC 999V99    12345       PIC 999.99
PIC 999V99    02345       PIC 999.9
PIC 999V99    71234       PIC 99.99
PIC 9(4)      2456        PIC 999.99
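
One way to check your answers is to code the moves and display the results. The program below is a minimal sketch rather than part of the original course material: the program and data names are invented, and the values in the Data column are written with an explicit decimal point where the sending picture contains a V (12345 in PIC 999V99 is 123.45). The square brackets are there only to make leading and trailing spaces visible.

```cobol
$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID. SpecialInsertionDemo.
*> Hypothetical demo program; all data names are illustrative.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 SendDecimal   PIC 999V99.
01 SendInteger   PIC 9(4).
01 Edit3Dot2     PIC 999.99.
01 Edit3Dot1     PIC 999.9.
01 Edit2Dot2     PIC 99.99.

PROCEDURE DIVISION.
Begin.
    MOVE 123.45 TO SendDecimal
    MOVE SendDecimal TO Edit3Dot2
    DISPLAY "12345 into PIC 999.99 gives [" Edit3Dot2 "]"

    MOVE 023.45 TO SendDecimal
    MOVE SendDecimal TO Edit3Dot1
    DISPLAY "02345 into PIC 999.9  gives [" Edit3Dot1 "]"

    MOVE 712.34 TO SendDecimal
    MOVE SendDecimal TO Edit2Dot2
    DISPLAY "71234 into PIC 99.99  gives [" Edit2Dot2 "]"

    MOVE 2456 TO SendInteger
    MOVE SendInteger TO Edit3Dot2
    DISPLAY "2456  into PIC 999.99 gives [" Edit3Dot2 "]"
    STOP RUN.
```

Some compilers may issue a warning for the moves that truncate digits. That truncation is exactly what the last three rows are meant to demonstrate.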

Fixed Insertion
Fixed Insertion editing inserts the symbol at the beginning or end of the edited
item.

The Fixed Insertion editing symbols are:

the plus (+) and minus (-) signs,

the letters CR and DB representing credit and debit,

and the currency symbol, usually the $ sign.

The default currency symbol is the dollar sign ($) but it may be changed to a
different symbol by the CURRENCY SIGN IS clause, in the SPECIAL-NAMES paragraph,
of the CONFIGURATION SECTION, in the ENVIRONMENT DIVISION.

All symbols count toward the size of the printed item.

Plus and minus symbols


These must appear in the leftmost or rightmost character
positions and they count towards the size of the data item.
They must be the first or last character in the PICTURE
string.

Minus
If the sending item is negative, a minus sign is printed. If
the sending item is positive, a space is printed instead.
Use this to highlight negative values only.

Plus
If the sending item is negative, a minus is printed and if
the sending item is positive, a plus is printed. Use this
when you always want the sign printed.

CR and DB
CR and DB count towards the data item size and occupy
two character positions. They may only appear in the
rightmost position. Both are only printed if the sending
item is negative. Otherwise two spaces are printed.

The currency symbol (usually $).


The currency symbol must be the leftmost character and it
counts towards the size of the item. It may be preceded by
a plus or a minus sign.

Fixed Insertion examples
As with the previous examples/questions, see if you can figure out what result
will be produced when the value in the Sending item is moved to the edited
picture in the Receiving item.

Sending item              Receiving item
Picture       Data        Picture       Result
PIC S999      -123        PIC -999
PIC S999      -123        PIC 999-
PIC S999      +123        PIC -999
PIC S9(5)     +12345      PIC +9(5)
PIC S9(3)     -123        PIC +9(3)
PIC S9(3)     -123        PIC 999+
PIC S9(4)     +1234       PIC 9(4)CR
PIC S9(4)     -1234       PIC 9(4)CR
PIC S9(4)     +1234       PIC 9(4)DB
PIC S9(4)     -1234       PIC 9(4)DB
PIC 9(4)      1234        PIC $99999
PIC 9(4)      0000        PIC $ZZZZZ
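
The rows of this table can also be tried out in code. The following sketch uses invented data names and assumes the default currency symbol ($); it moves each sending value to the receiving pictures from the table and displays the results between square brackets so that any spaces can be seen.

```cobol
$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID. FixedInsertionDemo.
*> Hypothetical demo program; all data names are illustrative.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 SendS3          PIC S999.
01 SendS4          PIC S9(4).
01 SendS5          PIC S9(5).
01 SendU4          PIC 9(4).
01 EditLeadMinus   PIC -999.
01 EditTrailMinus  PIC 999-.
01 EditLeadPlus3   PIC +9(3).
01 EditLeadPlus5   PIC +9(5).
01 EditTrailPlus   PIC 999+.
01 EditCR          PIC 9(4)CR.
01 EditDB          PIC 9(4)DB.
01 EditDollar      PIC $99999.
01 EditDollarZ     PIC $ZZZZZ.

PROCEDURE DIVISION.
Begin.
    *> A negative value moved to each of the sign pictures.
    MOVE -123 TO SendS3
    MOVE SendS3 TO EditLeadMinus EditTrailMinus
                   EditLeadPlus3 EditTrailPlus
    DISPLAY "-123   [" EditLeadMinus "] [" EditTrailMinus "] ["
            EditLeadPlus3 "] [" EditTrailPlus "]"

    *> Positive values.
    MOVE +123 TO SendS3
    MOVE SendS3 TO EditLeadMinus
    DISPLAY "+123   [" EditLeadMinus "]"
    MOVE +12345 TO SendS5
    MOVE SendS5 TO EditLeadPlus5
    DISPLAY "+12345 [" EditLeadPlus5 "]"

    *> CR and DB with a positive and then a negative value.
    MOVE +1234 TO SendS4
    MOVE SendS4 TO EditCR EditDB
    DISPLAY "+1234  [" EditCR "] [" EditDB "]"
    MOVE -1234 TO SendS4
    MOVE SendS4 TO EditCR EditDB
    DISPLAY "-1234  [" EditCR "] [" EditDB "]"

    *> The currency symbol.
    MOVE 1234 TO SendU4
    MOVE SendU4 TO EditDollar
    DISPLAY "1234   [" EditDollar "]"
    MOVE ZEROS TO SendU4
    MOVE SendU4 TO EditDollarZ
    DISPLAY "0000   [" EditDollarZ "]"
    STOP RUN.
```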

Floating Insertion
The problem with using the fixed insertion symbols is that they can be somewhat
unsightly. Values like $0045,345.56 or -0012 are more acceptably presented as
$45,345.56 and -12.

What makes these formats more presentable is that the leading zeros have been
suppressed and the editing symbol has been "floated" up against the first non-zero
digit. In COBOL this is achieved using Floating Insertion.

Floating Insertion suppresses leading zeros, and "floats" the insertion symbol up
against the first non-zero digit.

The Floating Insertion symbols are:

the plus and minus signs

and the currency symbol.

Every floating symbol counts toward the size of the printed item.

Except for the left-most one, which is always printed, each Floating Insertion
symbol is a placeholder that may be replaced by a digit. Accordingly, there will
always be at least one symbol printed, even though this may be at the cost of
truncating the number (see the fourth row in the example below).

Floating Insertion examples
As with the previous examples/questions, see if you can figure out what result
will be produced when the value in the Sending item is moved to the edited
picture in the Receiving item.

Sending item              Receiving item
Picture       Data        Picture         Result
PIC 9(4)      0000        PIC $$,$$9.99
PIC 9(4)      0080        PIC $$,$$9.00
PIC 9(4)      0128        PIC $$,$$9.99
PIC 9(5)      57397       PIC $$,$$9
PIC S9(4)     -0005       PIC ++++9
PIC S9(4)     +0080       PIC ++++9
PIC S9(4)     -0080       PIC ----9
PIC S9(5)     +71234      PIC ----9
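
As a way of checking the rows above, the sketch below performs each move and displays the edited item. The program and data names are assumptions made for this illustration, and the default currency symbol ($) is assumed.

```cobol
$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID. FloatingInsertionDemo.
*> Hypothetical demo program; all data names are illustrative.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 SendU4         PIC 9(4).
01 SendU5         PIC 9(5).
01 SendS4         PIC S9(4).
01 SendS5         PIC S9(5).
01 EditDollar99   PIC $$,$$9.99.
01 EditDollar00   PIC $$,$$9.00.
01 EditDollarInt  PIC $$,$$9.
01 EditPlus       PIC ++++9.
01 EditMinus      PIC ----9.

PROCEDURE DIVISION.
Begin.
    MOVE ZEROS TO SendU4
    MOVE SendU4 TO EditDollar99
    DISPLAY "0000   [" EditDollar99 "]"
    MOVE 80 TO SendU4
    MOVE SendU4 TO EditDollar00
    DISPLAY "0080   [" EditDollar00 "]"
    MOVE 128 TO SendU4
    MOVE SendU4 TO EditDollar99
    DISPLAY "0128   [" EditDollar99 "]"

    *> Five digits into four digit positions: the left-most $ is
    *> always printed, so the high-order digit is lost.
    MOVE 57397 TO SendU5
    MOVE SendU5 TO EditDollarInt
    DISPLAY "57397  [" EditDollarInt "]"

    MOVE -5 TO SendS4
    MOVE SendS4 TO EditPlus
    DISPLAY "-0005  [" EditPlus "]"
    MOVE +80 TO SendS4
    MOVE SendS4 TO EditPlus
    DISPLAY "+0080  [" EditPlus "]"
    MOVE -80 TO SendS4
    MOVE SendS4 TO EditMinus
    DISPLAY "-0080  [" EditMinus "]"
    MOVE +71234 TO SendS5
    MOVE SendS5 TO EditMinus
    DISPLAY "+71234 [" EditMinus "]"
    STOP RUN.
```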


Suppression and Replacement Editing

Introduction Suppression and replacement editing is used to remove leading zeros
from the value to be edited. There are two varieties of suppression and
replacement editing:

Suppression of leading zeros and replacement with spaces

Suppression of leading zeros and replacement with asterisks

Notes
The characters Z and * are the suppression symbols.

Using Z in an editing picture instructs the computer to suppress a leading zero in
that character position and replace it with a space.

Using an * in an editing picture instructs the computer to suppress a leading zero
in that character position and replace it with an *.

If all the character positions in a data item are Z editing symbols and the sending
item is 0 then only spaces will be printed.

If a Z or * is used, the picture clause symbol 9 cannot appear to the left of it.
Suppression and Replacement editing examples

Sending item              Receiving item
Picture       Data        Picture          Result
PIC 9(5)      12345       PIC ZZ,999
PIC 9(5)      01234       PIC ZZ,999
PIC 9(5)      00123       PIC ZZ,999
PIC 9(5)      00012       PIC ZZ,999
PIC 9(5)      05678       PIC **,**9
PIC 9(5)      00567       PIC **,**9
PIC 9(5)      00000       PIC **,***
PIC 9(5)V99   00043.45    PIC $**,**9.99
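
The suppression rows can be tried out in the same way. This is only a sketch with invented names (ShowZ and ShowStar are hypothetical helper paragraphs) and it assumes the default currency symbol. Each result is displayed between square brackets so that the replacement spaces are visible.

```cobol
$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID. SuppressionDemo.
*> Hypothetical demo program; all data names are illustrative.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 Send5         PIC 9(5).
01 Send5V2       PIC 9(5)V99.
01 EditZ         PIC ZZ,999.
01 EditStar      PIC **,**9.
01 EditAllStar   PIC **,***.
01 EditCheque    PIC $**,**9.99.

PROCEDURE DIVISION.
Begin.
    MOVE 12345 TO Send5
    PERFORM ShowZ
    MOVE 01234 TO Send5
    PERFORM ShowZ
    MOVE 00123 TO Send5
    PERFORM ShowZ
    MOVE 00012 TO Send5
    PERFORM ShowZ

    MOVE 05678 TO Send5
    PERFORM ShowStar
    MOVE 00567 TO Send5
    PERFORM ShowStar

    MOVE ZEROS TO Send5
    MOVE Send5 TO EditAllStar
    DISPLAY Send5 " into PIC **,*** gives [" EditAllStar "]"

    MOVE 43.45 TO Send5V2
    MOVE Send5V2 TO EditCheque
    DISPLAY "00043.45 into PIC $**,**9.99 gives [" EditCheque "]"
    STOP RUN.

ShowZ.
    MOVE Send5 TO EditZ
    DISPLAY Send5 " into PIC ZZ,999 gives [" EditZ "]".

ShowStar.
    MOVE Send5 TO EditStar
    DISPLAY Send5 " into PIC **,**9 gives [" EditStar "]".
```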


Picture string restrictions
Some combinations of picture symbols are not permitted. The table below shows
the combinations of symbols that are allowed.

Character   May be followed by
P           P B 0 / , + - CR DB 9 V
B           P B 0 / , . + - CR DB 9 V
0           P B 0 / , . + - CR DB 9 V
/           P B 0 / , . + - CR DB 9 V
,           P B 0 / , . + - CR DB 9 V
.           B 0 / , . + - CR DB 9
+           P B 0 / , . + $ 9 V
-           P B 0 / , . - $ 9 V
CR or DB    Nothing at all
$           P B 0 / , . + - CR DB $ 9 V
9           P B 0 / , . + - CR DB 9 V
V           B 0 / , + - CR DB 9
The USAGE clause

Introduction

Aims The internal representation of data can be an important consideration for
program efficiency. Unfortunately, the default representation used by COBOL
for numeric data items can negatively impact the speed of computations. A more
efficient format for numeric data can be specified by using the USAGE clause.

This unit introduces the concept of internal data representations. It discusses the
default representation used in COBOL and outlines how that representation,
when used for numeric data, might cause inefficiencies.

The syntax of the USAGE clause is given and the various options explained.

The SYNCHRONIZED clause is introduced and a generalized example given.

Objectives By the end of this unit you should -

1. Know that text is stored in a computer using some encoding sequence.

2. Understand the problems caused by storing numeric data as ASCII digits.

3. Be able to use the USAGE clause to change the way numeric data is
stored in the computer.
Print files and variable-length records

Introduction

Aims This tutorial covers a number of topics including: COBOL print files, multiple
record-type files, variable length records, and run-time file name assignment.

Objectives By the end of this unit you should -

1. Understand how the records map on to the record buffer in a file
containing different types of record.

2. Be able to declare a file containing different types of record.

3. Understand the problems of declaring print-line records in the FILE
SECTION.

4. Be able to declare and use print files.

5. Know how to set up different kinds of variable length record.

6. Understand how variable length records, declared with the DEPENDING ON
phrase, work.

7. Be able to set up a file so that the file name can be assigned at run-time.

Sorting and Merging files

Introduction

Aims As we noted in the tutorial - Processing Sequential Files - it is possible to apply
processing to an ordered sequential file that is difficult, or impossible, when the file
is unordered.

When this kind of processing is required, and the data file we have to work with is an
unordered Sequential file, then part of the solution to the problem must be to sort the
file. COBOL provides the SORT verb for this purpose.

Sometimes, when two or more files are ordered on the same key field or fields, we
may want to combine them into one single ordered file. COBOL provides the MERGE
verb for this purpose.
Basic Procedure Division Commands

Introduction

Aims The PROCEDURE DIVISION contains the code used to manipulate the data described
in the DATA DIVISION. This tutorial examines some of the basic COBOL
commands used in the PROCEDURE DIVISION.

In the course of this tutorial you will examine how assignment is done in COBOL,
how the date and time can be obtained from the computer and how arithmetic
statements are written.

Objectives By the end of this unit you should -

1. Know how to get data from the keyboard and write it to the screen.

2. Know how to get the system date and time.

3. Understand how the MOVE is used for assignment in COBOL.

4. Understand how alphanumeric and numeric moves work.

5. Be able to use the arithmetic verbs to perform calculations.

Prerequisites Introduction to COBOL

Declaring data in COBOL


Introduction to Direct Access Files

Introduction

Aims The aim of this unit is to provide a gentle introduction to COBOL's direct access
file organizations and to equip you with a knowledge of the advantages and
disadvantages of each type of organization.

Objectives By the end of this unit you should:

1. Have a good understanding of the drawbacks of inserting, deleting and
amending records in an ordered Sequential file.

2. Have a basic knowledge of how Relative and Indexed files are organized.

3. Understand the advantages and disadvantages of the different file
organizations.

4. Be able to choose the appropriate file organization for a particular set of
circumstances.

Prerequisites You should be familiar with the material covered in the unit:

Sequential Files

Sequential files

Introduction Access to records in a Sequential file is serial. To reach a particular record, all the
preceding records must be read.

As we observed when the topic was introduced earlier in the course, the
organization of an unordered Sequential file means it is only practical to read
records from the file and add records to the end of the file (OPEN..EXTEND). It is
not practical to delete or update records.

While it is possible to delete, update and insert records in an ordered Sequential
file, these operations have some drawbacks.

Problems accessing ordered Sequential files.
Records in an ordered Sequential file are arranged, in order, on some key field or
fields. When we want to insert, delete or amend a record we must preserve the
ordering. The only way to do this is to create a new file. In the case of an
insertion or update, the new file will contain the inserted or updated record. In the
case of a deletion, the deleted record will be missing from the new file.

The main drawback to inserting, deleting or amending records in an ordered
Sequential file is that the entire file must be read and then the records written to a
new file. Since disk access is one of the slowest things we can do in computing,
this is very wasteful of computer time when only a few records are involved.

For instance, if 10 records are to be inserted into a 10,000 record file, then 10,000
records will have to be read from the old file and 10,010 written to the new file.
The average time to insert a new record will thus be very great.
Contrast with direct access files
While Sequential files have a number of advantages over other types of file
organization (and these are discussed fully in the final section), the fact that a new
file must be created when we delete, update or insert a record causes problems.

These problems are addressed by direct access files. Direct access files allow us
to read, update, delete and insert individual records, in situ, using a key.

Inserting records in an ordered Sequential file
To insert a record in an ordered Sequential file:

1. All the records with a key value less than the record to be inserted must be
read and then written to the new file.

2. Then the record to be inserted must be written to the new file.

3. Finally, the remaining records must be read and then written to the new file.
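
The three insertion steps can be sketched in code as follows. This is an illustrative program and not the course's own: the file names, record layout and data are invented, LINE SEQUENTIAL organization is assumed to be available (it is a common vendor extension), and the program first builds a small ordered file so that it can be run on its own.

```cobol
$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID. InsertRecordDemo.
*> Hypothetical sketch: file names, layout and data are invented.

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT OldFile ASSIGN TO "OLDFILE.DAT"
        ORGANIZATION IS LINE SEQUENTIAL.
    SELECT NewFile ASSIGN TO "NEWFILE.DAT"
        ORGANIZATION IS LINE SEQUENTIAL.

DATA DIVISION.
FILE SECTION.
FD OldFile.
01 OldRec.
   88 EndOfOldFile    VALUE HIGH-VALUES.
   02 OldKey          PIC 9(4).
   02 OldName         PIC X(10).
FD NewFile.
01 NewRec             PIC X(14).

WORKING-STORAGE SECTION.
01 InsertRec.
   02 InsertKey       PIC 9(4)  VALUE 0150.
   02 InsertName      PIC X(10) VALUE "Inserted".
01 InsertState        PIC X VALUE "N".
   88 NotYetInserted  VALUE "N".
   88 AlreadyInserted VALUE "Y".

PROCEDURE DIVISION.
Begin.
    *> Build a small ordered file so the sketch is self-contained.
    OPEN OUTPUT OldFile
    MOVE "0100Alpha" TO OldRec
    WRITE OldRec
    MOVE "0200Beta" TO OldRec
    WRITE OldRec
    CLOSE OldFile

    OPEN INPUT OldFile
    OPEN OUTPUT NewFile
    READ OldFile
        AT END SET EndOfOldFile TO TRUE
    END-READ
    PERFORM UNTIL EndOfOldFile
        *> Step 2: write the new record before the first higher key.
        IF NotYetInserted AND OldKey > InsertKey
            WRITE NewRec FROM InsertRec
            SET AlreadyInserted TO TRUE
        END-IF
        *> Steps 1 and 3: every old record is copied to the new file.
        WRITE NewRec FROM OldRec
        READ OldFile
            AT END SET EndOfOldFile TO TRUE
        END-READ
    END-PERFORM
    *> The new key was higher than every key in the old file.
    IF NotYetInserted
        WRITE NewRec FROM InsertRec
    END-IF
    CLOSE OldFile NewFile
    STOP RUN.
```

Note that every record in the old file is read and written even though only one record is being added, which is the cost discussed in this section.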

Deleting records from an ordered Sequential file
To delete a record in an ordered Sequential file:

1. All the records with a key value less than the record to be deleted must be
read and then written to the new file.

2. When the record to be deleted is encountered it is not written to the new
file.

3. Finally, all the remaining records must be read and then written to the new file.

Amending records in an ordered Sequential file
To amend a record in an ordered Sequential file:

1. All the records with a key value less than the record to be amended must
be read and then written to the new file.

2. Then the record to be amended must be read, the amendments applied to it,
and the amended record written to the new file.

3. Finally, all the remaining records must be read and then written to the new file.
Sequential files - animation
The Sequential file animation opposite shows the operations of inserting, deleting
and amending a record in an ordered Sequential file.

Summary The problem with Sequential files is that unless the file is ordered very few (only
read and add) operations can be applied to it.

Even when a Sequential file is ordered, delete, insert and amend operations are
prohibitively expensive (in processing terms) when only a few records in the file
are affected (i.e. when the "hit rate" is low).


Relative Files

Introduction As we have already noted, the problem with Sequential files is that access to the
records is serial. To reach a particular record, all the preceding records must be
read.

Direct access files allow direct access to a particular record in the file using a key
and this greatly facilitates the operations of reading, deleting, updating and
inserting records.

COBOL supports two kinds of direct access file organization - Relative and
Indexed.
Organization of Relative files
Records in Relative files are organized on ascending Relative Record Number. A
Relative file may be visualized as a one-dimensional table stored on disk, where the
Relative Record Number is the index into the table. Relative files support
sequential access by allowing the active records to be read one after another.

Relative files support only one key. The key must be numeric and must take a
value between 1 and the current highest Relative Record Number. Enough room
is allocated to the file to contain records with Relative Record Numbers between
1 and the highest record number.

For instance, if the highest relative record number used is 10,000 then room for
10,000 records is allocated to the file.

Figure 1 below contains a schematic representation of a Relative file. In this
example, enough room has been allocated on disk for 328 records. But although
there is room for 328 records in the current allocation, not all the record locations
contain records. The record areas labeled "free" have not yet had record values
written to them.

Figure 1: Relative File - Organization
Accessing records in a Relative file
To access a record in a Relative file a Relative Record Number must be
provided. Supplying this number allows the record to be accessed directly
because the system can use

the start position of the file on disk,
the size of the record,
and the Relative Record Number

to calculate the position of the record.

Because the file management system only has to make a few calculations to find
the record position, the Relative file organization is the faster of the two direct
access file organizations available in COBOL. It is also the more storage efficient
of the two.
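
The kind of calculation involved can be sketched as below. The formula is an assumption made for illustration (fixed-length record slots, byte positions, and Relative Record Numbers starting at 1) and is not a description of any particular file system. All the names and values are invented.

```cobol
$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID. RecordPositionDemo.
*> Hypothetical sketch of how a record position might be derived.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 FileStartPos   PIC 9(8)  VALUE 4096.
01 RecordSize     PIC 9(4)  VALUE 80.
01 RelRecNum      PIC 9(6)  VALUE 328.
01 RecordPos      PIC 9(10).

PROCEDURE DIVISION.
Begin.
    *> Record 1 sits at the start position, record 2 one record
    *> length further on, and so on.
    COMPUTE RecordPos =
        FileStartPos + (RelRecNum - 1) * RecordSize
    DISPLAY "Record " RelRecNum " starts at position " RecordPos
    STOP RUN.
```

No index has to be consulted and no other records have to be read, which is why this organization is so fast.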

Animation - Organization and use of Relative files
To read, insert, delete or update a record directly, the Relative Record Number of
the record must be placed in the key area and then the operation must be applied
to the file.
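
In code, placing the Relative Record Number in the key area and then applying the operation looks something like the sketch below. The file name, record layout and data names are assumptions made for this example. The program writes one record directly at record position 5 and then reads it back directly.

```cobol
$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID. RelativeFileDemo.
*> Hypothetical sketch: file name and record layout are invented.

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT SupplierFile ASSIGN TO "RELSUPP.DAT"
        ORGANIZATION IS RELATIVE
        ACCESS MODE IS RANDOM
        RELATIVE KEY IS SupplierKey
        FILE STATUS IS SupplierStatus.

DATA DIVISION.
FILE SECTION.
FD SupplierFile.
01 SupplierRec.
   02 SupplierName    PIC X(20).

WORKING-STORAGE SECTION.
01 SupplierKey        PIC 9(4).
01 SupplierStatus     PIC XX.

PROCEDURE DIVISION.
Begin.
    *> Direct write: the key area says where the record goes.
    OPEN OUTPUT SupplierFile
    MOVE 5 TO SupplierKey
    MOVE "Acme Supplies" TO SupplierName
    WRITE SupplierRec
        INVALID KEY
            DISPLAY "Write failed, status = " SupplierStatus
    END-WRITE
    CLOSE SupplierFile

    *> Direct read: the key area says which record to fetch.
    OPEN INPUT SupplierFile
    MOVE 5 TO SupplierKey
    READ SupplierFile
        INVALID KEY
            DISPLAY "No record at position " SupplierKey
        NOT INVALID KEY
            DISPLAY "Record " SupplierKey " = " SupplierName
    END-READ
    CLOSE SupplierFile
    STOP RUN.
```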

Summary A Relative file is organized like a one dimension table on disk where each record
is an element of the table. The Relative Record Number acts like a table index to
allow access to the records.

Indexed Files

Introduction While the usefulness of a Relative file is constrained by its restrictive key,
Indexed files suffer from no such limitation.

Indexed files may have up to 255 keys, the keys can be alphanumeric and only
the primary key must be unique.

In addition, it is possible to read an Indexed file sequentially on any of its keys.

Organization of Indexed files
An Indexed file may have multiple keys. The key upon which the data records are
ordered is called the primary key. The other keys are called alternate keys.

Records in the Indexed file are sequenced on ascending primary key. Over the
actual data records, the file system builds an index. When direct access is
required, the file system uses this index to find, read, insert, update or delete the
required record.

For each of the alternate keys specified in an Indexed file, an alternate index is
built. However, the lowest level of an alternate index does not contain actual data
records. Instead, this level is made up of base records which contain only the
alternate key value and a pointer to where the actual record is. These base records
are organized in ascending alternate key order.

As well as allowing direct access to records on the primary key or any of the 254
alternate keys, Indexed files may also be processed sequentially. When processed
sequentially, the records may be read in ascending order on the primary key or on
any of the alternate keys.

Since the data records are held in ascending primary key sequence it is easy to
see how the file may be accessed sequentially on the primary key. It is not quite
so obvious how sequential access on the alternate keys is achieved. This is covered
in the unit on Indexed files.

Animation - Organization and use of Indexed files
In the animation below you can see a representation of an Indexed file and its
overlying primary key index. Note that the index records point to the actual data
records which are held in ascending primary key sequence.

This animation shows how a direct read on the primary key is done. The record
to be read has a key value of "Ni". In the animation we see how the index is used
to find the required record.

The algorithm used for traversing the index is -

IF RecordKeyValue > RequiredKeyValue
   take this branch
ELSE
   go to next index record
END-IF
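
To make the traversal test concrete, the sketch below applies it to a single index level held in a table. The table contents ("Gz", "Nz", "Tz", "Zz") and all the data names other than RecordKeyValue and RequiredKeyValue are invented for the example, and a real file system would repeat the test at each level of the index before reaching the data records.

```cobol
$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID. IndexTraversalDemo.
*> Hypothetical sketch: one index level modelled as a table.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 IndexValues        PIC X(8) VALUE "GzNzTzZz".
01 IndexLevel REDEFINES IndexValues.
   02 RecordKeyValue  PIC XX OCCURS 4 TIMES.
01 RequiredKeyValue   PIC XX VALUE "Ni".
01 Idx                PIC 9 VALUE 1.
01 BranchState        PIC X VALUE "N".
   88 BranchFound     VALUE "Y".

PROCEDURE DIVISION.
Begin.
    PERFORM UNTIL BranchFound OR Idx > 4
        IF RecordKeyValue(Idx) > RequiredKeyValue
            *> Take this branch.
            SET BranchFound TO TRUE
        ELSE
            *> Go to the next index record.
            ADD 1 TO Idx
        END-IF
    END-PERFORM
    IF BranchFound
        DISPLAY "Take the branch under index record " Idx
    ELSE
        DISPLAY "Key is higher than every index record"
    END-IF
    STOP RUN.
```

With the values shown, "Gz" is not greater than "Ni" but "Nz" is, so the second index record's branch is taken.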

Summary An Indexed file may have multiple, alphanumeric, keys.

Only the primary key must be unique.

For each key specified for an Indexed file, an index will be built.


Comparison of COBOL file organizations


Introduction In this section we examine the advantages and disadvantages of the three
COBOL file organizations.

Sequential files

Disadvantages

Slow
The "hit rate" refers to the number of records in the file that are affected when
updating a file. For instance, if only 100 records are to be affected by an insert,
delete or amend operation in a file of 10,000 records then the hit rate is low. But
if 9,000 records are affected then the hit rate is high.

Sequential files are very slow to update when the hit rate is low because the entire
file must be read and then written to a new file, just to update a few records.

Complicated
Sequential files are also complicated to change. Changes to Sequential files are
batched together into a transaction file to minimize the low hit rate problem but
this makes updating the Sequential file much more complicated than updating a
direct access file.

The complications come from having to match the records in the transaction file
with those in the master file (i.e. the file to be updated).

Advantages

Fast
When the hit rate is high this is the fastest method of updating a file because the
record position does not have to be calculated and no indexes have to be
traversed.

Most storage efficient
This is the most storage efficient of all the file organizations. No indexes are
required. Space from deleted records is recovered. Only room actually required to
hold the records is allocated to the file.

Simple organization
This is the simplest file organization. Records are held serially.

Files may be stored on serial media
Sequential files may be stored on serial media such as magnetic tape.

These media are cheap, removable and voluminous.

Relative files

Disadvantages

Wastes storage if the file is only partially populated
If the file is only partially populated with records then this is a very wasteful file
organization. The file will be allocated enough room to hold records from 1 to the
highest Relative Record Number used, even if only a few records have actually
been written to the file.

For instance, if the first record written to the file has a Relative Record Number
of 10,000 then room for that many records is allocated to the file.

The fact that there is only one key and that it must be numeric and must take a
value between 1 and the highest record number is limiting.

Cannot recover space from deleted records
Relative files cannot recover the space from deleted records.

When a record is deleted in a Relative file, it is simply marked as deleted but the
actual space that used to be occupied by the record is still allocated to the file (see
record position 327 in Figure 1).

So if a Relative file is 560K in size when full, it will still be 560K when you have
deleted half the records.

Only a single key allowed
The usefulness of Relative files is severely constrained by the fact that:

they can only have one key,

the key must be numeric,

the key must have a value in the range 1 to the highest key value.

The single key is limiting because it is often the case that we need to access the
file on more than one key. For instance, in a file of student records we might want
to access the records on StudentId, StudentName, CourseCode or ModuleCode.

The key must be numeric
The mention of using StudentName, CourseCode or ModuleCode as a key
highlights another drawback with Relative files.

We frequently need to access the file using a key that is not numeric.

The key is inflexible
The fact that the key must be in the range 1 to the highest key value and that the
file system allocates space for all the records between 1 and the highest Relative
Record Number used imposes severe constraints on the key.

For instance, even though the StudentId is numeric we couldn't use it as a key
because the file system would allocate space for records from 1 to the highest
StudentId written to the file.

Suppose the highest StudentId written to the file was 9876543. The file system
would allocate space for 9,876,543 records.

Sometimes we can get around the limitations of the key by using a transformation
function to map the actual key on to the range of Relative Record Numbers.

There are a number of possible transformation, or "hashing", functions, including
truncation (only using some of the digits in the key as the Relative Record
Number), folding (breaking the key into two or more parts and summing the
parts), digit manipulation (manipulating some of the digits in the key to produce a
Relative Record Number) and modulus division (using the remainder of a
division operation as the Relative Record Number).

Some transformation functions might even allow keys that are not numeric.

Some transformation functions require special code to deal with duplication that
occurs when the transformation of two different keys produces the same Relative
Record Number.
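
Two of these transformation functions, modulus division and folding, are sketched below for the StudentId 9876543. The target range of 1 to 1000 record positions, the way the key is split into a three-digit and a four-digit part, and all the data names are assumptions chosen for the illustration. The sketch does not deal with two keys that map to the same Relative Record Number.

```cobol
$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID. KeyTransformDemo.
*> Hypothetical sketch of two key transformation functions.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 StudentId        PIC 9(7) VALUE 9876543.
*> The same seven digits viewed as two parts, used for folding.
01 StudentIdParts REDEFINES StudentId.
   02 IdHighPart    PIC 9(3).
   02 IdLowPart     PIC 9(4).
01 MaxRecords       PIC 9(4) VALUE 1000.
01 Quotient         PIC 9(7).
01 RemainderVal     PIC 9(4).
01 FoldedSum        PIC 9(5).
01 RelRecNum        PIC 9(4).

PROCEDURE DIVISION.
Begin.
    *> Modulus division: the remainder lies between 0 and
    *> MaxRecords - 1, so 1 is added to give a valid record number.
    DIVIDE StudentId BY MaxRecords
        GIVING Quotient REMAINDER RemainderVal
    ADD 1 TO RemainderVal GIVING RelRecNum
    DISPLAY "Modulus division gives " RelRecNum

    *> Folding: sum the parts of the key, then bring the sum
    *> into the same range by modulus division.
    ADD IdHighPart IdLowPart GIVING FoldedSum
    DIVIDE FoldedSum BY MaxRecords
        GIVING Quotient REMAINDER RemainderVal
    ADD 1 TO RemainderVal GIVING RelRecNum
    DISPLAY "Folding gives          " RelRecNum
    STOP RUN.
```

Either way the file only needs room for 1000 records instead of 9,876,543, at the price of having to handle keys that collide.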

Must be stored on direct access media
Because Relative files are direct access files they must be stored on direct access
media such as a hard disk or floppy disk. They cannot be stored on magnetic tape.

Advantages

Fastest direct access organization This is the fastest direct access organization. To reach a
particular record, only a few simple calculations have to
be done.

Very little storage overhead Unlike Indexed files, which must store the indexes as well
as the data, Relative files have only a small storage
overhead.

Can be read sequentially As well as allowing direct access, Relative files allow
sequential access to the records in the file.

Indexed files

Disadvantages

Slowest direct access organization Because Indexed files achieve direct access by traversing a
number of levels of index, this is the slowest direct access
organization.

Indexed files must have a primary key index and an index
for each alternate key.

Each level of index implies an I/O operation on the hard
disk.

For instance, in the Indexed file animation three I/O
operations were required to read the record (two for the
index records and one for the data record).

Because of this, Indexed files are substantially slower
than Relative files. They are especially slow when writing
or deleting records because the primary key index and the
alternate key indexes may need to be rebuilt.

Not very storage efficient Indexed files require more storage than other file
organizations because the file must store:

the primary key index records

the index records for each alternate key

the actual data records

the base records for each alternate key

In addition, the space from deleted records is only
partially recovered.

Must be stored on direct access media Because Indexed files are direct access files they must be
stored on direct access media such as hard or floppy
disks. They cannot be stored on magnetic tape.

Advantages

Versatile keys Indexed files can have multiple alphanumeric keys and
only the primary key has to be unique.

An Indexed file may be read sequentially on any of its
keys.

When we compare the number of disadvantages of
Indexed files with the advantages we might be forgiven
for thinking "Why would we ever use Indexed files?".

But the versatility of its keys overrides all its
disadvantages, with the result that Indexed files are the
most widely used direct access file organization.

Summary There is no one best file organization. Choosing an appropriate file organization
is a case of "horses for courses".

If the hit rate is high and we have no need for direct access to the records, a
Sequential file might be best.

If the hit rate is low or if we require direct access to the records but one numeric
key is sufficient, a Relative file might be the best choice.

If we need direct access on a number of keys or if the key must be alphanumeric,
then we must choose an Indexed file.

Relative Files

Introduction

Aims The aim of this unit is to provide you with a solid understanding of the declarations
required for Relative files and the Procedure Division verbs used to process them.

An additional aim is to introduce you to the uses of Declaratives and the declarations
required for them.

Objectives By the end of this unit you should:

1. Be able to write the Environment Division and Data Division declarations
required for a Relative file.

2. Understand the difference between a file's access mode and its
organization.

3. Be able to use the START, OPEN, CLOSE, READ, WRITE, REWRITE
and DELETE Procedure Division verbs required to process Relative files.

4. Be able to process a Relative file directly or sequentially.

5. Understand when, and how, to use Declaratives.

Prerequisites You should be familiar with the material covered in the unit;

Introduction to direct access files


A look at some programs that use Relative files

Example program - Creating a Relative file We'll start by looking at some example programs. In the first program, a Relative
file is created by reading records from a Sequential file and writing them to the
Relative file. The second program demonstrates how a Relative file may be read
directly and sequentially.

It is a good idea to download the program and use the animator to watch how the
Relative file is created. You can download both the program and the sequential
data file it uses.

Download Relative file example program 1


Download the Sequential data file

Example program - Reading a Relative file In this example we see how a Relative file may be read. The program allows the
file to be read either directly, using a key, or sequentially.

Below we see the results produced by two runs of the program.

RUN USING SEQUENTIAL READING


Enter Read type (Direct=1, Seq=2)-> 2
01 VESTRON VIDEOS OVER THE SEA SOMEWHERE IN LONDON
02 EMI STUDIOS HOLLYWOOD, CALIFORNIA, USA
03 BBC WILDLIFE BUSH HOUSE, LONDON, ENGLAND
04 CBS STUDIOS HOLLYWOOD, CALIFORNIA, USA
05 YACHTING MONTHLY TREE HOUSE, LONDON, ENGLAND
06 VIRGIN VIDEOS IS THIS ONE ALSO LOCATED IN ENGLAND
07 CIC VIDEOS NEW YORK PLAZZA, NEW YORK, USA

RUN USING DIRECT READ


Enter Read type (Direct=1, Seq=2)-> 1
Enter supplier key (2 digits)-> 05
05 YACHTING MONTHLY TREE HOUSE, LONDON, ENGLAND

If you downloaded the first example program and its data file you should already
have the Relative file (it was produced when you ran the first example program).
You can use this file with the second Relative file example program.

Download Relative file example program 2

Relative file - Declarations

Introduction As we have seen in the example programs, when Relative files are used a number
of new entries are required in the SELECT and ASSIGN clause.

Select and Assign clause syntax

The Optional phrase The OPTIONAL phrase must be specified for files opened for INPUT, I-O, or
EXTEND that need not be present when the program runs.

The Access Mode The ACCESS MODE of a file refers to the way in which the file is to be used. If an
ACCESS MODE of SEQUENTIAL is specified then it will only be possible to process
the records in the file sequentially. If RANDOM is specified it will only be possible
to access the file directly. If DYNAMIC is specified, the file may be accessed both
directly and sequentially.

The Relative Key phrase The RELATIVE KEY phrase is used to define the relative key. There can be only one
key in a Relative file. The UniqueRecKey must be a numeric data item and must
not be part of the file's record description, although it may be part of another file's
record description. It is normally described in the WORKING-STORAGE SECTION.

The File Status The FILE STATUS clause identifies a two character area of storage that holds the
result of every I-O operation for the file. The FILE STATUS data item is declared as
PIC X(2) in the WORKING-STORAGE SECTION. Whenever an I-O operation is
performed, some value will be returned to FileStatus indicating whether or not
the operation was successful.

There are a large number of FILE STATUS values but three of major interest are:

00 = Operation successful.

22 = Duplicate key, i.e. the record already exists.

23 = Record not found.

22 may occur after an attempt to write a record and 23 after trying to READ or
DELETE a record.

Note that although a code of 00 generally means the operation was successful
there are other codes that also indicate success.
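
To make these entries concrete, here is a minimal sketch of a complete set of declarations for a Relative file. The file name RELSUPP.DAT, the record layout and every data name are assumptions chosen for illustration. Note that standard COBOL spells the key phrase for a Relative file RELATIVE KEY IS. The level 88 condition names are one possible way of giving readable names to the three status values listed above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RelDecl.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *> OPTIONAL: the file need not exist when the program runs.
           SELECT OPTIONAL SupplierFile ASSIGN TO "RELSUPP.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS DYNAMIC
               RELATIVE KEY IS SupplierKey
               FILE STATUS IS SupplierStatus.
       DATA DIVISION.
       FILE SECTION.
       FD  SupplierFile.
       01  SupplierRecord.
           02  SupplierCode     PIC 99.
           02  SupplierName     PIC X(20).
           02  SupplierAddress  PIC X(50).
       WORKING-STORAGE SECTION.
      *> The relative key lives outside the record description.
       01  SupplierKey          PIC 99.
       01  SupplierStatus       PIC X(2).
           88  OperationOk          VALUE "00".
           88  RecordAlreadyExists  VALUE "22".
           88  RecordDoesNotExist   VALUE "23".
       PROCEDURE DIVISION.
       Begin.
           OPEN I-O SupplierFile
           DISPLAY "Status after OPEN: " SupplierStatus
           CLOSE SupplierFile
           STOP RUN.
```

Because the file is declared OPTIONAL, the status shown after the OPEN may be a successful code other than 00 when the file did not previously exist.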


Relative file - file processing verbs

Introduction Direct access files can support a far greater range of operations than Sequential
files. Just like Sequential files, direct access files support the OPEN, CLOSE, READ
and WRITE operations. But in addition to these, direct access files also support the
DELETE, REWRITE and START operations.

In this section we examine these new operations and any changes to the
operations we are already familiar with.

The Invalid Key clause When any of the file processing verbs are used for direct access, the INVALID KEY
clause must be used unless declaratives have been specified.

When the INVALID KEY clause is specified, any I-O error, such as attempting to
read a record that does not exist or write a record that already exists, will activate
the clause and cause the statement block following it to be executed.

OPEN/CLOSE verbs The syntax for the CLOSE is the same for all file organizations.
The full syntax for the OPEN verb is shown below. Note the new I-O entry. This
is used with direct access files when we intend to update, or both read and write to,
the file.

Notes
If the file is opened for input then only READ and START will be allowed.

If the file is opened for output then only WRITE will be allowed.

If the file is opened for I-O then READ, WRITE, START, REWRITE and DELETE will
be allowed.

If OPEN INPUT is used, and the file does not possess the OPTIONAL tag, then the
file must exist or the OPEN will fail.

If OPEN OUTPUT or I-O is used then the file will be created if it does not already
exist as long as the file possesses the OPTIONAL tag.

The READ verb When a Relative file has an ACCESS MODE of SEQUENTIAL, the format of the
READ is the same as for Sequential files, but when the ACCESS MODE is DYNAMIC
or RANDOM, the READ has a different format.
Reading using a key

Operation
To read a record directly from a Relative file

1. The key value must be placed in the KeyName data item (the KeyName
data item is the area of storage identified as the relative key in the
RELATIVE KEY IS phrase of the SELECT and ASSIGN clause).

2. Then the READ must be executed.

When the READ is executed, the record with the Relative Record Number equal
to the present value of the relative key (i.e. KeyName) will be read into the file's
record buffer (defined in the FD entry).

If the record does not exist the INVALID KEY clause will activate and the
statement block following the clause will be executed.

After the record has been read the next record pointer will be pointing to the next
record in the file.

Notes
The file must have an ACCESS MODE declaration for the file specifying an
ACCESS MODE DYNAMIC or RANDOM.

The file must be opened for I-O or INPUT.
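
The two steps of a direct READ might look like the sketch below. So that it can run on its own, the program first writes a single record to a scratch file. The file name RELREAD.DAT, the record layout and the data names are all assumptions made for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RelRead.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SupplierFile ASSIGN TO "RELREAD.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS SupplierKey
               FILE STATUS IS SupplierStatus.
       DATA DIVISION.
       FILE SECTION.
       FD  SupplierFile.
       01  SupplierRecord.
           02  SupplierCode     PIC 99.
           02  SupplierName     PIC X(20).
       WORKING-STORAGE SECTION.
       01  SupplierKey          PIC 99.
       01  SupplierStatus       PIC X(2).
       PROCEDURE DIVISION.
       Begin.
      *> Build a tiny file so the sketch is self-contained.
           OPEN OUTPUT SupplierFile
           MOVE 5 TO SupplierKey SupplierCode
           MOVE "YACHTING MONTHLY" TO SupplierName
           WRITE SupplierRecord
               INVALID KEY DISPLAY "Write failed " SupplierStatus
           END-WRITE
           CLOSE SupplierFile

           OPEN INPUT SupplierFile
      *> Step 1: put the key value in the relative key item.
           MOVE 5 TO SupplierKey
      *> Step 2: execute the READ.
           READ SupplierFile
               INVALID KEY DISPLAY "No record " SupplierKey
               NOT INVALID KEY DISPLAY SupplierRecord
           END-READ
      *> Record 3 was never written, so INVALID KEY activates.
           MOVE 3 TO SupplierKey
           READ SupplierFile
               INVALID KEY DISPLAY "No record " SupplierKey
               NOT INVALID KEY DISPLAY SupplierRecord
           END-READ
           CLOSE SupplierFile
           STOP RUN.
```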

Reading Sequentially When the ACCESS MODE is DYNAMIC and we wish to read the file sequentially
then we must use the format below for the READ. There is very little difference
between this format and the format of the ordinary sequential READ except that in
this format the NEXT RECORD phrase is used.

Notes
This format is used to access a Relative file sequentially when the ACCESS MODE
has been declared as DYNAMIC and the file has been opened for INPUT or I-O.

For Relative files the next record pointer may be positioned by the START verb or
by doing a direct READ.

The READ NEXT will read the record pointed to by the next record pointer (this
will be the current record if positioned by the START and the next record if
positioned by a direct READ).

The AT END clause is activated when the end of the file has been reached.
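
A sequential pass over a Relative file declared with ACCESS MODE IS DYNAMIC could be sketched as follows. The program writes three sample records first so that it can run on its own. The file name RELNEXT.DAT and all data names are assumptions for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RelNext.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SupplierFile ASSIGN TO "RELNEXT.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS DYNAMIC
               RELATIVE KEY IS SupplierKey
               FILE STATUS IS SupplierStatus.
       DATA DIVISION.
       FILE SECTION.
       FD  SupplierFile.
       01  SupplierRecord.
           02  SupplierCode     PIC 99.
           02  SupplierName     PIC X(20).
       WORKING-STORAGE SECTION.
       01  SupplierKey          PIC 99.
       01  SupplierStatus       PIC X(2).
       01  RecNum               PIC 99.
       01  EndOfFileFlag        PIC X VALUE "N".
           88  EndOfFile            VALUE "Y".
       PROCEDURE DIVISION.
       Begin.
      *> Create records 1 to 3 so there is something to read.
           OPEN OUTPUT SupplierFile
           PERFORM VARYING RecNum FROM 1 BY 1 UNTIL RecNum > 3
               MOVE RecNum TO SupplierKey SupplierCode
               MOVE "SAMPLE SUPPLIER" TO SupplierName
               WRITE SupplierRecord
                   INVALID KEY DISPLAY "Write failed"
               END-WRITE
           END-PERFORM
           CLOSE SupplierFile

      *> READ .. NEXT RECORD walks the file in key order.
           OPEN INPUT SupplierFile
           READ SupplierFile NEXT RECORD
               AT END SET EndOfFile TO TRUE
           END-READ
           PERFORM UNTIL EndOfFile
               DISPLAY SupplierRecord
               READ SupplierFile NEXT RECORD
                   AT END SET EndOfFile TO TRUE
               END-READ
           END-PERFORM
           CLOSE SupplierFile
           STOP RUN.
```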

The WRITE verb The format for writing sequentially to a Relative file is the same as that used for
writing to a Sequential file, but to write directly to a Relative file a key must be
used and this requires a different WRITE format.

Operation
To write a record directly to a Relative file

1. The record value must be placed in the file's record buffer.

2. The key value must be placed in the file's relative key.

3. The WRITE must be executed.

When the WRITE is executed the data in the record buffer is written to the record
position with a Relative Record Number equal to the present value of the key.

Notes
The INVALID KEY clause must be used for direct access to Relative files. If the
record being written already 'exists' the INVALID KEY clause will be activated and
the statement following the clause will be executed. Any other I-O error will also
activate the INVALID KEY clause.
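
The three steps of a direct WRITE, and the effect of writing to a slot that is already occupied, can be sketched as below. The file name RELWRITE.DAT and the data names are assumptions for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RelWrite.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SupplierFile ASSIGN TO "RELWRITE.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS SupplierKey
               FILE STATUS IS SupplierStatus.
       DATA DIVISION.
       FILE SECTION.
       FD  SupplierFile.
       01  SupplierRecord.
           02  SupplierCode     PIC 99.
           02  SupplierName     PIC X(20).
       WORKING-STORAGE SECTION.
       01  SupplierKey          PIC 99.
       01  SupplierStatus       PIC X(2).
       PROCEDURE DIVISION.
       Begin.
           OPEN OUTPUT SupplierFile
           PERFORM WriteSupplier
      *> The second attempt targets the same slot again.
           PERFORM WriteSupplier
           CLOSE SupplierFile
           STOP RUN.

       WriteSupplier.
      *> Step 1: record value into the record buffer.
           MOVE 7 TO SupplierCode
           MOVE "CIC VIDEOS" TO SupplierName
      *> Step 2: key value into the relative key.
           MOVE 7 TO SupplierKey
      *> Step 3: execute the WRITE.
           WRITE SupplierRecord
               INVALID KEY
                   DISPLAY "Not written, status " SupplierStatus
               NOT INVALID KEY
                   DISPLAY "Written to slot " SupplierKey
           END-WRITE.
```

The first WRITE should succeed and the second should activate the INVALID KEY clause with the duplicate key status.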

The REWRITE verb The REWRITE is used to update a record in situ by overwriting it. The format of
the REWRITE verb is shown below.

Operation
The REWRITE is normally used in the following way;

1. The record we wish to update is read directly into the record buffer (place
the key value in the key area and execute the READ).

2. The required changes are made to the record in the buffer.

3. The record in the buffer is rewritten to the file.

Notes
If the file has an ACCESS MODE of SEQUENTIAL then the INVALID KEY clause
cannot be used but if the ACCESS MODE is RANDOM or DYNAMIC the INVALID
KEY clause must be present (unless declaratives are being used).

When the file has ACCESS MODE IS SEQUENTIAL the record that is replaced must
have been the subject of a READ or START before the REWRITE is used.

For all access modes the file must be opened for I-O.
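
The read, change, rewrite sequence might be sketched like this. The program creates one record first so that it can run on its own. The file name RELREWR.DAT, the supplier names and the data names are assumptions for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RelRewr.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SupplierFile ASSIGN TO "RELREWR.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS SupplierKey
               FILE STATUS IS SupplierStatus.
       DATA DIVISION.
       FILE SECTION.
       FD  SupplierFile.
       01  SupplierRecord.
           02  SupplierCode     PIC 99.
           02  SupplierName     PIC X(20).
       WORKING-STORAGE SECTION.
       01  SupplierKey          PIC 99.
       01  SupplierStatus       PIC X(2).
       PROCEDURE DIVISION.
       Begin.
      *> Create one record so there is something to update.
           OPEN OUTPUT SupplierFile
           MOVE 2 TO SupplierKey SupplierCode
           MOVE "EMI STUDIOS" TO SupplierName
           WRITE SupplierRecord
               INVALID KEY DISPLAY "Write failed " SupplierStatus
           END-WRITE
           CLOSE SupplierFile

      *> REWRITE needs the file opened for I-O.
           OPEN I-O SupplierFile
      *> Step 1: read the record directly into the buffer.
           MOVE 2 TO SupplierKey
           READ SupplierFile
               INVALID KEY DISPLAY "No record " SupplierKey
           END-READ
      *> Step 2: change the record in the buffer.
           MOVE "EMI RECORDS" TO SupplierName
      *> Step 3: rewrite the buffer to the same slot.
           REWRITE SupplierRecord
               INVALID KEY DISPLAY "Rewrite failed " SupplierStatus
           END-REWRITE
      *> Read it back to confirm the update.
           READ SupplierFile
               INVALID KEY DISPLAY "No record " SupplierKey
               NOT INVALID KEY DISPLAY SupplierRecord
           END-READ
           CLOSE SupplierFile
           STOP RUN.
```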

The DELETE verb

Operation
To delete a record

1. The Relative Record Number of the record to be deleted must be placed
in the file's relative key.

2. Then the DELETE must be executed.

The record with the Relative Record Number equal to the current value of the key
area will be deleted.

Notes
To use the DELETE, the file must have been opened for I-O.

When the ACCESS MODE IS SEQUENTIAL a READ statement must access the
record to be deleted.

When the ACCESS MODE IS RANDOM or DYNAMIC the record to be deleted is
identified by the file's relative key.

If the record does not exist the INVALID KEY clause will be activated.

Note that when a record is deleted its space does not become available to other
records.
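
A direct DELETE, followed by a second DELETE on the now empty slot, could be sketched as follows. The file name RELDEL.DAT and the data names are assumptions for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RelDel.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SupplierFile ASSIGN TO "RELDEL.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS SupplierKey
               FILE STATUS IS SupplierStatus.
       DATA DIVISION.
       FILE SECTION.
       FD  SupplierFile.
       01  SupplierRecord.
           02  SupplierCode     PIC 99.
           02  SupplierName     PIC X(20).
       WORKING-STORAGE SECTION.
       01  SupplierKey          PIC 99.
       01  SupplierStatus       PIC X(2).
       PROCEDURE DIVISION.
       Begin.
      *> Create one record so there is something to delete.
           OPEN OUTPUT SupplierFile
           MOVE 4 TO SupplierKey SupplierCode
           MOVE "CBS STUDIOS" TO SupplierName
           WRITE SupplierRecord
               INVALID KEY DISPLAY "Write failed " SupplierStatus
           END-WRITE
           CLOSE SupplierFile

      *> DELETE needs the file opened for I-O.
           OPEN I-O SupplierFile
      *> Step 1: Relative Record Number into the relative key.
           MOVE 4 TO SupplierKey
      *> Step 2: execute the DELETE.
           PERFORM DeleteSupplier
      *> The slot is now empty, so a second DELETE is invalid.
           PERFORM DeleteSupplier
           CLOSE SupplierFile
           STOP RUN.

       DeleteSupplier.
           DELETE SupplierFile RECORD
               INVALID KEY
                   DISPLAY "Nothing to delete at " SupplierKey
               NOT INVALID KEY
                   DISPLAY "Deleted record " SupplierKey
           END-DELETE.
```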

The START verb In Relative files, the only thing the START verb is used for is to control the
position of the next record pointer. Where the START verb appears in a program it
is usually followed by a sequential READ or WRITE.

Operation
To position the Next Record Pointer at a particular record

1. Move the Relative Record Number of the record to the file's relative key.

2. Execute the START..KEY IS EQUAL TO

To position the Next Record Pointer at the first record in the file

1. Move zeros to the file's relative key.

2. Execute the START..KEY IS GREATER THAN

To position the pointer at the last record in the file

1. Move all 9's to the file's relative key.

2. Execute the START..KEY IS LESS THAN

Notes
KeyDataName is the file's relative key. It is the key of comparison.

The file must be opened for INPUT or I-O when the START is executed.

Execution of the START statement does not change the contents of the record area
(i.e. the START does not actually read the record it merely positions the next
record pointer).

When the START is executed the Next Record Pointer is set to the first record in the
file whose key satisfies the condition. If no record satisfies the condition then the
INVALID KEY clause is activated.
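
The sketch below positions the next record pointer with START and then reads sequentially from that point. It writes five sample records first so that it can run on its own. The file name RELSTART.DAT and all data names are assumptions for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RelStart.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SupplierFile ASSIGN TO "RELSTART.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS DYNAMIC
               RELATIVE KEY IS SupplierKey
               FILE STATUS IS SupplierStatus.
       DATA DIVISION.
       FILE SECTION.
       FD  SupplierFile.
       01  SupplierRecord.
           02  SupplierCode     PIC 99.
           02  SupplierName     PIC X(20).
       WORKING-STORAGE SECTION.
       01  SupplierKey          PIC 99.
       01  SupplierStatus       PIC X(2).
       01  RecNum               PIC 99.
       01  EndOfFileFlag        PIC X VALUE "N".
           88  EndOfFile            VALUE "Y".
       PROCEDURE DIVISION.
       Begin.
      *> Create records 1 to 5 so there is something to read.
           OPEN OUTPUT SupplierFile
           PERFORM VARYING RecNum FROM 1 BY 1 UNTIL RecNum > 5
               MOVE RecNum TO SupplierKey SupplierCode
               MOVE "SAMPLE SUPPLIER" TO SupplierName
               WRITE SupplierRecord
                   INVALID KEY DISPLAY "Write failed"
               END-WRITE
           END-PERFORM
           CLOSE SupplierFile

           OPEN INPUT SupplierFile
      *> Position the next record pointer at record 3.
      *> START itself does not read anything.
           MOVE 3 TO SupplierKey
           START SupplierFile KEY IS EQUAL TO SupplierKey
               INVALID KEY SET EndOfFile TO TRUE
           END-START
      *> Sequential reads now begin at record 3.
           PERFORM UNTIL EndOfFile
               READ SupplierFile NEXT RECORD
                   AT END SET EndOfFile TO TRUE
                   NOT AT END DISPLAY SupplierRecord
               END-READ
           END-PERFORM
           CLOSE SupplierFile
           STOP RUN.
```

With these sample records the loop should display records 3, 4 and 5 only.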


Introduction to Declaratives

Introduction Declaratives allow us to define exception handling procedures for files. When
there are declaratives for a file the INVALID KEY clause is not required.

The Declaratives template The template below shows how declaratives are set up.

Declaratives must be defined at the start of the PROCEDURE DIVISION.

They must start with the DECLARATIVES keyword and end with the keywords
END DECLARATIVES.

Declaratives are divided into sections and each section is associated with a
particular USE phrase.

The sections and USE phrases allow us to specify a separate exception
procedure for each file.

We can also specify general exception procedures for all INPUT, OUTPUT, I-O
and EXTEND errors.

PROCEDURE DIVISION.
DECLARATIVES.
SectionOne SECTION.
    USE clause for file1.
ParSectOne1.
    ????????????????
    ????????????????
ParSectOne2.
    ????????????????
    ????????????????
SectionTwo SECTION.
    USE clause for file2.
ParSectTwo1.
    ????????????????
    ????????????????
ParSectTwo2.
    ????????????????
    ????????????????
END DECLARATIVES.
Main SECTION.
Begin.

The Use phrase

Notes

A USE statement can be used only in a sentence immediately after a section
header in the PROCEDURE DIVISION declaratives area. It must be the only
statement in the sentence.

ERROR and EXCEPTION are synonyms.

A Declarative cannot refer to a non-Declarative procedure. However, the
PERFORM can transfer control from a Declarative procedure to another
Declarative procedure or from a non-Declarative procedure to a Declarative
procedure.

A Declarative executes automatically whenever an I-O condition occurs that
would result in a non-zero value in the FILE STATUS data item.

When USE phrases refer to both a FileName and the more general INPUT,
OUTPUT, I-O and EXTEND then the FileName procedure takes precedence.

Example program fragment

PROCEDURE DIVISION.
DECLARATIVES.
FileError SECTION.
    USE AFTER ERROR PROCEDURE ON RelativeFile.
CheckFileStatus.
    EVALUATE TRUE
      WHEN RecordDoesNotExist  DISPLAY "Record does not exist"
      WHEN RecordAlreadyExists DISPLAY "Record already exists"
      WHEN FileNotOpen         OPEN I-O RelativeFile
    END-EVALUATE.
END DECLARATIVES.
Main SECTION.
Begin.

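
For readers who want to try declaratives out, the following is a minimal complete sketch built around the same idea as the fragment. The file name RELDECL.DAT, the data names and the condition names are assumptions, and the declarative here only reports the error. Note that none of the I-O statements carries an INVALID KEY clause.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DeclDemo.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SupplierFile ASSIGN TO "RELDECL.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS SupplierKey
               FILE STATUS IS SupplierStatus.
       DATA DIVISION.
       FILE SECTION.
       FD  SupplierFile.
       01  SupplierRecord.
           02  SupplierCode     PIC 99.
           02  SupplierName     PIC X(20).
       WORKING-STORAGE SECTION.
       01  SupplierKey          PIC 99.
       01  SupplierStatus       PIC X(2).
           88  RecordAlreadyExists  VALUE "22".
           88  RecordDoesNotExist   VALUE "23".
       PROCEDURE DIVISION.
       DECLARATIVES.
       FileError SECTION.
           USE AFTER ERROR PROCEDURE ON SupplierFile.
       CheckFileStatus.
           EVALUATE TRUE
               WHEN RecordDoesNotExist
                   DISPLAY "Record does not exist"
               WHEN RecordAlreadyExists
                   DISPLAY "Record already exists"
               WHEN OTHER
                   DISPLAY "File error, status " SupplierStatus
           END-EVALUATE.
       END DECLARATIVES.
       MainProcess SECTION.
       Begin.
           OPEN OUTPUT SupplierFile
           PERFORM WriteFirst
      *> Writing slot 1 again should trigger the declarative.
           PERFORM WriteFirst
           CLOSE SupplierFile
           OPEN INPUT SupplierFile
      *> Slot 9 is empty, so this READ should trigger it too.
           MOVE 9 TO SupplierKey
           READ SupplierFile
           CLOSE SupplierFile
           STOP RUN.

       WriteFirst.
           MOVE 1 TO SupplierKey SupplierCode
           MOVE "FIRST SUPPLIER" TO SupplierName
           WRITE SupplierRecord.
```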
Using Tables

Introduction

Aims In most programming languages the term "array" is used to describe repeated, or
multiple- occurrence, data-items. COBOL uses the term - "table".

In this tutorial you will discover why a table, or array, is a useful structure for
solving certain kinds of problems.

Objectives By the end of this unit you should -

1. Understand why you might want to use a table as part of your solution to
Creating Tables - syntax and semantics

Introduction

Aims In the "Using Tables" tutorial we examined why we might want to use tables as part
of the solution to a programming problem.

In this tutorial we examine the syntax and semantics of table declaration. We
demonstrate how to create single and multi-dimension tables, how to create variable
length tables and how to set up a table pre-filled with data.

Objectives By the end of this unit you should -

1. Understand how the OCCURS clause is used in a table declaration.

2. Know how to use a subscript to access the elements of a table.


Q2 What happens when the statement MOVE 456 TO
DeptTotal(5) is executed?

Q3 What statement would we have to write to display the
second VAT rate in the VATRateTable?

Q4 What is the purpose of the level 88 attached to the
RatesTable?

Declaring a variable length table A variable-length table may be declared using the OCCURS clause syntax
below.

OCCURS SmallestSize#l TO LargestSize#l TIMES
DEPENDING ON ActualSize#i

The amount of storage allocated to a variable-length table is defined by the
value of LargestSize and is assigned at compile time. Standard COBOL
has no mechanism for dynamic memory allocation, although the coming
OO-COBOL standard addresses this problem.

Rules

1. ActualSize#i cannot itself be a data-item in a table.

2. This format may only be used to vary the number of elements in the
first dimension of a table.

Example

01 BooksReservedTable.
02 BookId PIC 9(7) OCCURS 1 TO 10
DEPENDING ON NumOfReservations.
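
A small runnable sketch built around this declaration is shown below. The declaration of NumOfReservations, the subscript Idx and the sample book numbers are assumptions added for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VarTable.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> The controlling item is declared outside the table.
       01  NumOfReservations    PIC 99 VALUE 3.
       01  BooksReservedTable.
           02  BookId           PIC 9(7) OCCURS 1 TO 10
                                DEPENDING ON NumOfReservations.
       01  Idx                  PIC 99.
       PROCEDURE DIVISION.
       Begin.
      *> Three reservations are assumed for illustration.
           MOVE 1234567 TO BookId(1)
           MOVE 2345678 TO BookId(2)
           MOVE 3456789 TO BookId(3)
      *> Only the first NumOfReservations elements are in use.
           PERFORM VARYING Idx FROM 1 BY 1
                   UNTIL Idx > NumOfReservations
               DISPLAY "Reservation " Idx ": " BookId(Idx)
           END-PERFORM
           STOP RUN.
```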

Group items as elements

Introduction The elements of a table do not have to be elementary items. An element
can be a group item. In other words each element can be subdivided into
two, or more, subordinate items.

Setting up a combined table Let's illustrate this with an example. Suppose the specification of the
County Tax Report program were changed to say that, as well as the
amount of tax paid in each county, the report should also display the
number of tax payers.

One way we could satisfy this change to the specification would involve
setting up separate tables to hold the county tax totals and county tax payer
counts. For example-

01 CountyTaxTable.
   02 CountyTax  PIC 9(8)V99 OCCURS 26 TIMES.

01 CountyPayerTable.
   02 PayerCount PIC 9(7) OCCURS 26 TIMES.

But we could also solve the problem by setting up just a single table and
defining each element as a group item. The group item would consist of
the CountyTax and the PayerCount.

For example -

01 CountyTaxTable.
   02 CountyTaxDetails OCCURS 26 TIMES.
      03 CountyTax  PIC 9(8)V99.
      03 PayerCount PIC 9(7).

In this example, CountyTaxTable is the name for the whole table and each
element is called CountyTaxDetails. The element is further subdivided into
the elementary items CountyTax and PayerCount.

We can represent this table diagrammatically as follows;

To refer to an item that is subordinate to a table element, the same number of
subscripts must be used as when referring to the element itself. So, to refer
to CountyTaxDetails, CountyTax and PayerCount in the table above, the
forms CountyTaxDetails(sub), CountyTax(sub) and PayerCount(sub) must
be used.
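
The subscripting rule can be tried out with a short sketch such as the one below. The table declaration is the one given above. The subscript CountyNum, the edited item PrnTax and the sample values are assumptions for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CountyTab.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  CountyTaxTable.
           02  CountyTaxDetails OCCURS 26 TIMES.
               03  CountyTax    PIC 9(8)V99.
               03  PayerCount   PIC 9(7).
       01  CountyNum            PIC 99 VALUE 4.
       01  PrnTax               PIC Z(7)9.99.
       PROCEDURE DIVISION.
       Begin.
      *> Clear every element of the table in one statement.
           MOVE ZEROS TO CountyTaxTable
      *> Each subordinate item takes the element's subscript.
           ADD 1234.45 TO CountyTax(CountyNum)
           ADD 1 TO PayerCount(CountyNum)
           MOVE CountyTax(CountyNum) TO PrnTax
           DISPLAY "County " CountyNum " tax " PrnTax
                   " payers " PayerCount(CountyNum)
           STOP RUN.
```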

Self Assessment Questions Q1. Examine the CountyTaxTable described above. Suppose that in our
program we have the following statements;

MOVE ZEROS TO CountyTax(3)
MOVE 67 TO PayerCount(6)
MOVE 1234.45 TO CountyTax(4)
MOVE ZEROS TO CountyTaxDetails(4)
MOVE ZEROS TO CountyTaxTable

What effect on the table will each of these statements have? Copy the table
diagram above and fill in your answer.

Q2. What is the difference between the following data descriptions;

01 CountyTaxTable1.
   02 CountyTaxDetails OCCURS 26 TIMES.
      03 CountyTax  PIC 9(8)V99.
      03 PayerCount PIC 9(7).

01 CountyTaxTable2.
   02 CountyTaxDetails.
      03 CountyTax  PIC 9(8)V99 OCCURS 26 TIMES.
      03 PayerCount PIC 9(7) OCCURS 26 TIMES.

Example program This example program implements the County Tax Report program. It has
some interesting features;

Each element of the table is a group item subdivided into CountyTax
and PayerCount.

The CountyTax table is subordinate to the data-item CountyTaxTable


Searching Tables

Introduction

Aims The task of searching a table to determine whether it contains a particular value is
a common operation. The method used to search a table depends heavily on how
the values in the table are organized.

For instance, if the values are not ordered, the table may only be searched
sequentially, but if the values are ordered, then either a sequential or a binary
search may be used.

In COBOL, the SEARCH verb is used for sequential searches and the SEARCH ALL
for binary searches.

This tutorial introduces the syntax and rules of operation of the SEARCH and
SEARCH ALL verbs. It introduces the extensions to the OCCURS clause that are
required when the SEARCH or SEARCH ALL is used. It shows how the SET verb
Subprograms

Introduction

Aims To provide a brief introduction to building modular systems using separately
compiled and/or contained subprograms.

To explore the syntax, semantics, and use of the CALL verb.

To examine how "state memory" is implemented in COBOL and to demonstrate
the effects of the IS INITIAL clause and the CANCEL command.

To examine how contained subprograms are implemented and to show how the IS
GLOBAL clause may be used to implement a form of "Information Hiding".

To show how the IS EXTERNAL clause may be used to set up an area of storage
that may be accessed by any program in the run-unit.

To introduce Structured Design and to summarize its criteria for achieving good
quality subprograms.

Objectives By the end of this unit you should -

1. Understand the difference between a subprogram and a contained


The COPY verb

Introduction

Aims In a large software system it is often very useful to be able to keep record, file and
table descriptions in a central source text library and then to import those
descriptions into the programs that require them. In COBOL the COPY verb
allows us to do this.

This unit introduces the COPY verb.

It describes some problems of large software systems that use of the COPY verb
helps to alleviate.

It introduces the syntax of the COPY verb and discusses how the COPY verb, and
in particular the COPY..REPLACING, works.

It introduces the notion of a text word and describes the role that these play in
matching the text in the REPLACING phrase with the text in the library file.

The unit ends with a number of example programs.

Objectives By the end of this unit you should -


The INSPECT verb

Introduction

Aims Many programming languages rely on library functions for string handling.
COBOL too, uses Intrinsic Functions for some tasks but most string manipulation
is done using Reference Modification and the three string handling verbs :
INSPECT, STRING and UNSTRING.

This unit examines the first of the three string handling verbs and aims to provide
you with a solid understanding of its various formats and modifying phrases.

Objectives By the end of this unit you should:

1. Understand how the INSPECT verb works.

2. Have a good knowledge of the different formats and modifying phrases of
the INSPECT verb.

3. Be able to use the INSPECT to count, replace and convert characters in a
string.

Prerequisites You should be familiar with the following material;

Data Declaration

Iteration

Selection

Tables

Edited Pictures

Syntax legend To simplify the syntax diagrams and reduce the number of rules that must be
explained, special operand endings are used in this unit's syntax diagrams.

These operand endings have the following meaning:

$i    uses an alphanumeric data item
$il   uses an alphanumeric data item or a string literal
#i    uses a numeric data item
#il   uses a numeric data item or numeric literal
$#i   uses a numeric or an alphanumeric data item


Using the INSPECT

Introduction The INSPECT has four formats;

1. The first format is used for counting characters in a string.

2. The second replaces a group of characters in a string with another group of
characters.

3. The third combines both operations in one statement.

4. The fourth format converts each of a set of characters to its corresponding
character in another set of characters.

Example program - Counting letter occurrences using the INSPECT We'll start by looking at the PROCEDURE DIVISION of an example program that
uses the INSPECT verb. This program reads through a file, counts how many
times each letter of the alphabet occurs and displays the results.

To do this the program;

Reads in a line of text

Converts all the characters to upper case using the INSPECT..CONVERTING

Counts the number of times each letter appears in the line using the
INSPECT..TALLYING and increments the count in LetterCount. It gets the
letters for inspection from a pre-defined table of letters (see the unit on
tables for the definition of this letter table)

Uses a PERFORM..VARYING to display the counts after they have been
accumulated.

PROCEDURE DIVISION.
Begin.
    OPEN INPUT TextFile
    READ TextFile
        AT END SET EndOfFile TO TRUE
    END-READ
    PERFORM UNTIL EndOfFile
*       Fold the line to upper case so "a" and "A" count together.
        INSPECT TextLine
            CONVERTING "abcdefghijklmnopqrstuvwxyz"
                    TO "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
*       Add this line's occurrences of each letter to its counter.
        PERFORM VARYING idx FROM 1 BY 1 UNTIL idx > 26
            INSPECT TextLine TALLYING LetterCount(idx)
                FOR ALL Letter(idx)
        END-PERFORM
        READ TextFile
            AT END SET EndOfFile TO TRUE
        END-READ
    END-PERFORM
    PERFORM VARYING idx FROM 1 BY 1 UNTIL idx > 26
        DISPLAY "Letter " Letter(idx)
                " occurs " LetterCount(idx) " times"
    END-PERFORM
    CLOSE TextFile
    STOP RUN.

How the INSPECT works The INSPECT scans the source string from left to right counting, replacing or
converting characters under the control of the TALLYING, REPLACING or
CONVERTING phrases.

The behavior of the INSPECT is modified by using the LEADING, FIRST, BEFORE
and AFTER phrases.

An ALL, LEADING, CHARACTERS, FIRST or CONVERTING phrase may only be
followed by one BEFORE and one AFTER phrase.

The modifying phrases The LEADING phrase causes counting/replacement of all Compare$il characters
from the first valid one encountered to the first invalid one.

The FIRST phrase causes only the first valid character to be replaced.

The BEFORE phrase designates as valid those characters to the left of the
delimiter associated with it.

The AFTER phrase designates as valid those characters to the right of the
delimiter associated with it.

If the delimiter is not present in the SourceStr$i then using the BEFORE phrase
implies the whole string and using the AFTER phrase implies no characters at all.
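
The effect of the modifying phrases can be seen by counting the same character under different phrases. The sketch below is only an illustration. The sample string and the data names SourceStr and CharCount are assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. InspMod.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  SourceStr            PIC X(12) VALUE "XXAXXBXXXXXX".
       01  CharCount            PIC 99.
       PROCEDURE DIVISION.
       Begin.
      *> ALL: every "X" in the string is counted.
           MOVE ZERO TO CharCount
           INSPECT SourceStr TALLYING CharCount FOR ALL "X"
           DISPLAY "ALL      : " CharCount
      *> LEADING: only the unbroken run of "X" at the start.
           MOVE ZERO TO CharCount
           INSPECT SourceStr TALLYING CharCount FOR LEADING "X"
           DISPLAY "LEADING  : " CharCount
      *> BEFORE: only characters left of the first "B".
           MOVE ZERO TO CharCount
           INSPECT SourceStr TALLYING CharCount
               FOR ALL "X" BEFORE INITIAL "B"
           DISPLAY "BEFORE B : " CharCount
      *> AFTER: only characters right of the first "A".
           MOVE ZERO TO CharCount
           INSPECT SourceStr TALLYING CharCount
               FOR ALL "X" AFTER INITIAL "A"
           DISPLAY "AFTER A  : " CharCount
      *> Missing delimiter: BEFORE covers the whole string.
           MOVE ZERO TO CharCount
           INSPECT SourceStr TALLYING CharCount
               FOR ALL "X" BEFORE INITIAL "Q"
           DISPLAY "BEFORE Q : " CharCount
      *> Missing delimiter: AFTER covers no characters.
           MOVE ZERO TO CharCount
           INSPECT SourceStr TALLYING CharCount
               FOR ALL "X" AFTER INITIAL "Q"
           DISPLAY "AFTER Q  : " CharCount
           STOP RUN.
```

For this sample string the six counts should be 10, 2, 4, 8, 10 and 0 respectively. CharCount is reset before each INSPECT because TALLYING adds to the counter rather than replacing its value.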


The counting INSPECT


Syntax

This format of the Inspect is used to count characters in a string.

Notes If Compare$il or Delim$il is a figurative constant it is 1 character in size.

An ALL, LEADING or CHARACTERS phrase can have not more than one BEFORE
and one AFTER phrase following it.

Examples

INSPECT FullName TALLYING UnstrPtr FOR LEADING SPACES.

INSPECT SourceLine TALLYING ECount
    FOR ALL "e" AFTER INITIAL "start"
                BEFORE INITIAL "end".


The replacing INSPECT

Syntax
This format of the INSPECT is used to replace characters in a string.

Rules The sizes of Compare$il and Replace$il must be equal.

When Replace$il is a figurative constant its size equals that of Compare$il.

When there is a CHARACTERS phrase, the size of ReplaceChar$il and the
delimiter which may follow it (Delim$il) must be one character.

If Compare$il, Delim$il or Replace$il is a figurative constant it is 1 character in
size.

Example 1 The following examples work on the data in StringData to produce the results
shown in the storage schematic below.

1. INSPECT StringData REPLACING ALL "R" BY "G"
       AFTER INITIAL "A" BEFORE INITIAL "Q".

2. INSPECT StringData REPLACING LEADING "R" BY "G"
       AFTER INITIAL "A" BEFORE INITIAL "Z".

3. INSPECT StringData REPLACING ALL "R" BY "G"
       AFTER INITIAL "A" BEFORE INITIAL "Z".

4. INSPECT StringData REPLACING FIRST "R" BY "G"
       AFTER INITIAL "A" BEFORE INITIAL "Q".

5. INSPECT StringData REPLACING
       ALL "RRRR" BY "FROG"
       AFTER INITIAL "A" BEFORE INITIAL "Q".

Example 2 This example inspects the string TextLine and checks it for each of the 4 letter
swear words in the table. If one is found it is replaced by the text *#@!.

PERFORM VARYING idx FROM 1 BY 1 UNTIL idx > 10
    INSPECT TextLine REPLACING ALL SwearWord(idx) BY "*#@!"
END-PERFORM

Example 3 In this example we want to create an edited item with a floating Rouble currency
symbol instead of a floating dollar sign.

To achieve this we use a picture clause edited with a floating dollar sign and then we
use the INSPECT to replace the dollar sign with the Rouble symbol "R".

This works because when a value is moved into an edited item the editing is
immediately applied to the value.
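
A minimal sketch of the Rouble technique is given below. The data names Amount and PrnAmount, the picture clauses and the sample value are assumptions for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. Rouble.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Amount               PIC 9(5)V99 VALUE 1234.50.
       01  PrnAmount            PIC $$$,$$9.99.
       PROCEDURE DIVISION.
       Begin.
      *> The MOVE applies the editing, floating "$" included.
           MOVE Amount TO PrnAmount
           DISPLAY "Dollar form: " PrnAmount
      *> Only one "$" remains after editing, so swap it for "R".
           INSPECT PrnAmount REPLACING ALL "$" BY "R"
           DISPLAY "Rouble form: " PrnAmount
           STOP RUN.
```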


The combined INSPECT

Syntax
This format of the INSPECT combines the syntactic elements of the previous two formats allowing both
counting and replacing to be done in one statement.


The converting INSPECT

Syntax
How this format of the INSPECT works The INSPECT..CONVERTING works on individual characters. Compare$il is a list
of characters that will be replaced with the characters in Convert$il on a one for
one basis.

For instance the statement -

INSPECT StringData CONVERTING "abc" TO "XYZ".

replaces "a" with "X", "b" with "Y" and "c" with "Z".

Rules 1. Compare$il and Convert$il must be equal in size.

2. When Convert$il is a figurative constant its size equals that of
Compare$il.

3. The same character cannot appear more than once in the Compare$il
(because each character in the Compare$il string is associated with a
replacement).

Example 1 The following examples work on the data in StringData to produce the results
shown in the storage schematic below.

1. INSPECT StringData CONVERTING "fxtd" TO "ZYAB".

2. INSPECT StringData REPLACING ALL "fxtd" BY "ZYAB".

3. INSPECT StringData REPLACING ALL "x" BY "Y"
                                    "d" BY "Z"
                                    "f" BY "S".

Example 2 This example shows how the INSPECT..CONVERTING can be used to implement a
simple encoding mechanism.

It converts the character 0 to character 5, 1 to 2, 2 to 9, 3 to 8 etc. Conversion
starts when the word "codeon" is encountered in the string and stops when
"codeoff" is encountered.

INSPECT TextLine
CONVERTING "0123456789" TO "5298317046"
AFTER INITIAL "codeon" BEFORE INITIAL "codeoff".
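
To see the statement at work, it can be wrapped in a small program like the sketch below. The sample contents of TextLine are an assumption made for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. Encode.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  TextLine             PIC X(40)
           VALUE "id 2024 codeon 0123 codeoff 0123".
       PROCEDURE DIVISION.
       Begin.
           DISPLAY "Before: " TextLine
      *> Only digits between the two markers are converted.
           INSPECT TextLine
               CONVERTING "0123456789" TO "5298317046"
               AFTER INITIAL "codeon" BEFORE INITIAL "codeoff"
           DISPLAY "After : " TextLine
           STOP RUN.
```

With this sample line only the digits between the markers change, so "0123" becomes "5298" while "2024" and the final "0123" are left alone.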

Example 3 In this example, the INSPECT..CONVERTING is used to convert upper case letters to
lower case and vice versa.

The first INSPECT shows how we can convert the characters using string literals
and the next two show how storing the strings in data items makes the conversion
operation more versatile.

Actually, these days we can do this with the UPPER-CASE and LOWER-CASE
Intrinsic Functions.

* Data Division entries
01 AlphaChars.
   02 AlphaLower PIC X(26) VALUE "abcdefghijklmnopqrstuvwxyz".
   02 AlphaUpper PIC X(26) VALUE "ABCDEFGHIJKLMNOPQRSTUVWXYZ".

* Procedure Division entries
INSPECT CustAddress
    CONVERTING "abcdefghijklmnopqrstuvwxyz"
    TO "ABCDEFGHIJKLMNOPQRSTUVWXYZ".

INSPECT CustAddress
    CONVERTING AlphaLower TO AlphaUpper.

INSPECT CustAddress
    CONVERTING AlphaUpper TO AlphaLower.
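For comparison, the same case conversion can be sketched with the UPPER-CASE and LOWER-CASE Intrinsic Functions mentioned above. The program name and the address text are invented for illustration, and a compiler that supports Intrinsic Functions is assumed.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CaseDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * The address text is invented for illustration.
       01  CustAddress         PIC X(20) VALUE "3 Winchester Drive".
       PROCEDURE DIVISION.
       Begin.
      *    Each function returns a converted copy of its argument.
           MOVE FUNCTION UPPER-CASE(CustAddress) TO CustAddress
           DISPLAY CustAddress
           MOVE FUNCTION LOWER-CASE(CustAddress) TO CustAddress
           DISPLAY CustAddress
           STOP RUN.
```

The first DISPLAY should show "3 WINCHESTER DRIVE" and the second "3 winchester drive".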

The STRING verb


Introduction

Aims The aim of this unit is to provide you with a solid understanding of the
STRING verb.

Objectives By the end of this unit you should:

1. Understand how the STRING verb works.

2. Be able to use the STRING to concatenate text strings in your programs.

Prerequisites You should be familiar with the material covered in the units:

Data Declaration

Iteration

Selection

Tables

Edited Pictures

The INSPECT verb



The STRING verb

Introduction The STRING verb is used for string concatenation. That is, it is used to join
together the contents of two or more source strings or partial source strings to
create a single destination string.

The examples below show how the STRING is used.

The first example concatenates the entire contents of the identifiers Ident1 and
Ident2 with the literal "10" and puts the resulting string into DestString.

The second example concatenates the entire contents of Ident1, with the partial
contents of Ident2 (all the characters up to the first space) and Ident3 (all the
characters up to the word "Frogs"). The result is placed in Ident4, starting at the
position held in StrPtr.

STRING Ident1, Ident2, "10" DELIMITED BY SIZE
    INTO DestString
END-STRING

STRING
    Ident1 DELIMITED BY SIZE
    Ident2 DELIMITED BY SPACES
    Ident3 DELIMITED BY "Frogs"
    INTO Ident4 WITH POINTER StrPtr
END-STRING.
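The two statements can be run inside a small program such as the sketch below. The data item sizes and values are assumptions chosen for the demonstration; they are not the declarations used by the tutorial.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. StrDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Sizes and values are invented for illustration.
       01  Ident1              PIC X(4)  VALUE "Ten ".
       01  Ident2              PIC X(13) VALUE "green bottles".
       01  Ident3              PIC X(16) VALUE " tree Frogs here".
       01  Ident4              PIC X(20) VALUE SPACES.
       01  DestString          PIC X(20) VALUE SPACES.
       01  StrPtr              PIC 99    VALUE 1.
       PROCEDURE DIVISION.
       Begin.
      *    Whole of Ident1, whole of Ident2, then the literal.
           STRING Ident1, Ident2, "10" DELIMITED BY SIZE
               INTO DestString
           END-STRING
           DISPLAY "[" DestString "]"
      *    Whole of Ident1, Ident2 up to its first space and
      *    Ident3 up to the word "Frogs".
           STRING
               Ident1 DELIMITED BY SIZE
               Ident2 DELIMITED BY SPACES
               Ident3 DELIMITED BY "Frogs"
               INTO Ident4 WITH POINTER StrPtr
           END-STRING
           DISPLAY "[" Ident4 "] StrPtr = " StrPtr
           STOP RUN.
```

With these values DestString should receive "Ten green bottles10" and Ident4 should receive "Ten green tree ", leaving StrPtr at 16, the position where the next character would go.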

STRING syntax

Delim$il  Character or set of characters in a source string that terminates data
transfer to the destination string.

Pointer#i  Points to the position in the destination string where the next
character will go.

How the STRING works

The STRING statement moves characters from the source string to the destination
string according to the rules for alphanumeric to alphanumeric moves.

However, no space filling occurs and unless characters in the destination string
are explicitly overwritten they remain undisturbed.

As usual with alphanumeric to alphanumeric MOVEs, data movement is from left
to right.

The leftmost character of the source is moved to the leftmost position of the
destination then the next leftmost of the source to the next leftmost of the
destination and so on.

When a number of source strings are concatenated, characters are moved from
the leftmost source string first.

When a WITH POINTER phrase is used, its value determines the starting character
position for insertion into the destination string.

The ON OVERFLOW clause executes if there are still valid characters left in the
source strings but the destination string is full.
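The difference between a MOVE and a STRING, and the effect of ON OVERFLOW, can be seen in a short sketch like the one below. The program name, data item and values are invented for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. StrFill.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * The name and values here are invented for illustration.
       01  DestString          PIC X(10) VALUE "**********".
       PROCEDURE DIVISION.
       Begin.
      *    MOVE space-fills the rest of the destination.
           MOVE "abc" TO DestString
           DISPLAY "[" DestString "]"
           MOVE ALL "*" TO DestString
      *    STRING overwrites three characters, leaves the rest.
           STRING "abc" DELIMITED BY SIZE
               INTO DestString
           END-STRING
           DISPLAY "[" DestString "]"
      *    Twelve characters are offered to a ten character
      *    destination, so ON OVERFLOW is taken.
           STRING "abcdef" "ghijkl" DELIMITED BY SIZE
               INTO DestString
               ON OVERFLOW DISPLAY "Overflow"
           END-STRING
           DISPLAY "[" DestString "]"
           STOP RUN.
```

The MOVE should leave "abc" followed by seven spaces, the first STRING should leave "abc*******", and the second STRING should display "Overflow" and leave "abcdefghij" in the destination.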

Data movement termination

Data movement from a particular source string ends when either;

the end of the source string is reached

the end of the destination string is reached

the delimiter is detected.

STRING termination The STRING statement ends when either;

all the source strings have been processed

the destination string is full

the pointer points outside the string.

STRING rules 1. Where a literal can be used, a Figurative Constant can be used except the
ALL (literal).

2. When a Figurative Constant is used its size is one character.

3. The destination item Dest$i must be an elementary data item without
editing symbols or the JUSTIFIED clause.

4. Pointer#i must be an integer item and its description must allow it to
contain a value one greater than the size of the destination string. For
instance, a pointer declared as PIC 9 is too small if the destination string
is 10 characters long.

STRING clauses ON OVERFLOW

If the ON OVERFLOW clause is used then the statement following it will be
executed if there are still characters left to pass across in the source field(s) but
the destination field has been filled.

WITH POINTER

The WITH POINTER phrase names an identifier/dataname that holds the position in
the destination string where the next character will go.

When the WITH POINTER phrase is used, the program must set the pointer to an
initial value greater than 0 and no greater than the length of the destination string
before the STRING statement executes.

If the WITH POINTER phrase is not used, operation on the destination field starts
from the leftmost position.

DELIMITED BY SIZE

The DELIMITED BY SIZE clause means that the whole of the sending field will be
added to the destination string.


STRING examples

Introduction This section starts with some examples to reinforce the material you have just
read and it ends with a few test/examples to let you see if you have understood
everything so far.

Example 1 (animation) The animation in this example shows how the STRING works by
stepping through each clause in the STRING statement.

Example 2 (animation) The WITH POINTER phrase is often very useful because it means
that we don't have to STRING all the source strings together in one go.

Using the WITH POINTER we can build the destination string by executing a
number of separate STRING statements or by executing a STRING statement a
number of times.

In this example we show how a destination string may be built a piece at a time
by executing several separate STRING statements. At the end of this unit in the
combined example you will see how a STRING statement may be used within a
loop to build a destination string.
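As a rough illustration of building a destination string a piece at a time, the sketch below assembles a date line with three separate STRING statements that share one pointer. All of the names and values are invented for the demonstration and are not taken from the animation.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. StrPtrDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * All names and values here are invented for illustration.
       01  DayStr              PIC X(2)  VALUE "25".
       01  MonthStr            PIC X(9)  VALUE "December".
       01  YearStr             PIC X(4)  VALUE "2024".
       01  DateLine            PIC X(20) VALUE SPACES.
      * PIC 99 can hold 21, one more than the size of DateLine.
       01  StrPtr              PIC 99    VALUE 1.
       PROCEDURE DIVISION.
       Begin.
           STRING DayStr DELIMITED BY SIZE
                  " "    DELIMITED BY SIZE
               INTO DateLine WITH POINTER StrPtr
           END-STRING
      *    StrPtr is now 4, so the month starts at position 4.
           STRING MonthStr DELIMITED BY SPACES
                  " "      DELIMITED BY SIZE
               INTO DateLine WITH POINTER StrPtr
           END-STRING
           STRING YearStr DELIMITED BY SIZE
               INTO DateLine WITH POINTER StrPtr
           END-STRING
           DISPLAY "[" DateLine "] next position " StrPtr
           STOP RUN.
```

Each STRING carries on from where the previous one stopped, so DateLine should end up holding "25 December 2024" with StrPtr left at 17.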


Self assessment questions/examples The examples below consist of data declarations and a
number of STRING statements.

For each STRING example, write out the text that would be displayed by the
DISPLAY statement. Assume that the data starts off fresh for each STRING
statement.

Treat the examples as an opportunity to test your understanding of the STRING
verb. Before you look at the answers, write down your own answer.

Data Declarations.

01 StringFields.
   02 Field1 PIC X(18) VALUE "Where does this go".
   02 Field2 PIC X(30) VALUE "This is the destination string".
   02 Field3 PIC X(15) VALUE "Here is another".
   02 Field4 PIC X(15) VALUE SPACES.

01 StrPointers.
   02 StrPtr PIC 99.
   02 NewPtr PIC 9.

STRING examples.

1. STRING Field1 DELIMITED BY SPACES INTO Field2.
   DISPLAY Field2.

2. STRING Field1 DELIMITED BY SIZE INTO Field2.
   DISPLAY Field2.

3. MOVE 6 TO StrPtr.
   STRING Field1, Field3 DELIMITED BY SPACE
       INTO Field2 WITH POINTER StrPtr
       ON OVERFLOW DISPLAY "String Error"
       NOT ON OVERFLOW DISPLAY "No Error"
   END-STRING.
   DISPLAY Field2.

4. STRING Field1, Field2, Field3
       DELIMITED BY SPACES INTO Field4
   END-STRING.
   DISPLAY Field4.

5. MOVE 4 TO NewPtr.
   STRING Field1 DELIMITED BY "this"
          Field3 DELIMITED BY SPACE
          "END" DELIMITED BY SIZE
       INTO Field2
   END-STRING.
   DISPLAY Field2.

6. MOVE 4 TO NewPtr.
   STRING Field1 DELIMITED BY "this"
          Field3 DELIMITED BY SPACE
          "Tom" DELIMITED BY SIZE
       INTO Field2 WITH POINTER NewPtr
       ON OVERFLOW DISPLAY "String Error"
       NOT ON OVERFLOW DISPLAY "No Error"
   END-STRING.
   DISPLAY Field2.


The UNSTRING verb


Introduction

Aims The aim of this unit is to provide you with a solid understanding of the
UNSTRING verb.

Objectives By the end of this unit you should:

1. Understand how the UNSTRING works.

2. Be able to use the UNSTRING to divide a string into sub-strings.

Prerequisites You should be familiar with the material covered in the units:

Data Declaration

Iteration

Selection

Tables

Edited Pictures

The INSPECT verb

The STRING verb

The UNSTRING verb

Introduction The UNSTRING is used to divide a string into sub-strings.

The examples below show how the UNSTRING is used.

The first example breaks a name string into its three constituent parts - the first
name, second name and surname. For instance the string "John Joseph Ryan" is
broken into the three strings "John", "Joseph" and "Ryan".

The second example breaks an address string (where the parts of the address are
separated from one another by commas) into separate address lines. The address
lines are stored in a six element table.

Since not all addresses will have six parts exactly, we use the TALLYING clause to
discover how many parts there are.

UNSTRING FullName DELIMITED BY ALL SPACES
    INTO FirstName, SecondName, Surname
END-UNSTRING.

UNSTRING CustAddress DELIMITED BY ","
    INTO AdrLine(1), AdrLine(2), AdrLine(3),
         AdrLine(4), AdrLine(5), AdrLine(6)
    TALLYING IN AdrLinesUsed
END-UNSTRING.

UNSTRING syntax

Delim$il  Character or set of characters in the source string that terminate data
transfer to a particular destination string.

HoldDelim$i  Holds the delimiter that caused data transfer to a particular
destination string to terminate.

CharCounter#i  Is associated with a particular destination string and holds a count
of the characters copied into it.

Pointer#i  Points to the position in the source string from which the next
character will be taken.

DestCounter#i  Holds the count of the number of destination strings affected by
the UNSTRING operation.

How the UNSTRING works

The UNSTRING copies characters from the source string, to the destination string,
until a condition is encountered that terminates data movement.

When data movement ends for a particular destination string, the next destination
string becomes the receiving area and characters are copied into it until once
again a terminating condition is encountered.

Characters are copied from the source string to the destination strings according
to the rules for alphanumeric moves. There is space filling.

Data movement termination

When the DELIMITED BY clause is used data movement from the source string to
the current destination string ends when either;

a delimiter is encountered in the source string

the end of the source string is reached.

When the DELIMITED BY clause is not used, data movement from the source
string to the current destination string ends when either;

the destination string is full

the end of the source string is reached.

UNSTRING termination

The UNSTRING statement terminates when either;

All the characters in the source string have been examined

All the destination strings have been processed

Some error condition is encountered (such as the pointer pointing outside the
source string).

UNSTRING rules 1. Where a literal can be used any figurative constant can be used except the
ALL (literal).

2. When a figurative constant is used then its length is one character.

3. Characters are moved from the source string to the destination strings
according to the rules for the MOVE, with space filling if required.

4. The delimiter is moved into HoldDelim$i according to the rules for the
MOVE.

5. The DELIMITER IN and COUNT IN phrases may be specified only if the
DELIMITED BY phrase is used.

UNSTRING clauses ON OVERFLOW

The ON OVERFLOW is activated if :-

The unstring pointer (Pointer#i) is not pointing to a character position
within the source string when the UNSTRING executes.

All the destination strings have been processed but there are still valid
unexamined characters in the source string.

The statements following the NOT ON OVERFLOW are executed if the UNSTRING
is about to terminate successfully.

COUNT IN
The COUNT IN clause is associated with a particular destination string and holds a
count of the number of characters passed to the destination string.

TALLYING IN
Only one TALLYING clause can be used with each UNSTRING. It holds a count of
the number of destination strings affected by the UNSTRING operation.

WITH POINTER
When the WITH POINTER clause is used the Pointer#i holds the position of the
next non-delimiter character to be examined in the source string.

Pointer#i must be large enough to hold a value one greater than the size of the
source string.

ALL
When the ALL phrase is used, contiguous delimiters are treated as if only one
delimiter had been encountered.

If the ALL is not used, contiguous delimiters will result in spaces being sent to
some of the destination strings.
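The effect of the ALL phrase can be checked with a sketch like the following, which unpacks the same name twice, once with ALL and once without. The program name and the data values are invented for illustration; note the two spaces between the first two names.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UnstrAll.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * All names and values here are invented for illustration.
      * FullName has two spaces between the first two names.
       01  FullName            PIC X(20) VALUE "John  Joseph Ryan".
       01  FirstName           PIC X(10).
       01  SecondName          PIC X(10).
       01  Surname             PIC X(10).
       PROCEDURE DIVISION.
       Begin.
      *    With ALL the two spaces count as one delimiter.
           UNSTRING FullName DELIMITED BY ALL SPACES
               INTO FirstName SecondName Surname
           END-UNSTRING
           DISPLAY "[" FirstName "][" SecondName "][" Surname "]"
      *    Without ALL the second space ends an empty field.
           UNSTRING FullName DELIMITED BY SPACE
               INTO FirstName SecondName Surname
           END-UNSTRING
           DISPLAY "[" FirstName "][" SecondName "][" Surname "]"
           STOP RUN.
```

With ALL the three names land in the three destinations. Without it, SecondName should receive only spaces and "Joseph" should end up in Surname.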

DELIMITER IN
A DELIMITER IN clause is associated with a particular destination string.
HoldDelim$i holds the delimiter that was encountered in the source string.

If the DELIMITER IN phrase is used with the ALL phrase then only one occurrence
of the delimiter will be moved to HoldDelim$i.

DELIMITED BY
When the DELIMITED BY phrase is used, characters are examined in the source
string and transferred to the current destination string until one of the specified
delimiters is encountered or the end of the source string is reached.

If there is not enough room in the destination string to take all the characters that
are sent to it from the source string then the remaining characters will be
truncated/lost.

When the delimiter is encountered in the source string, the next destination string
becomes current and characters are transferred into it, from the source string.

Delimiters are not transferred or counted in CharCounter#i.
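The clauses described above can be seen working together in the sketch below. The program name, data names and values are all invented for illustration. PartsUsed is set to zero in its declaration because the TALLYING phrase adds to the counter rather than initialising it.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UnstrCnt.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * All names and values here are invented for illustration.
       01  SourceStr           PIC X(12) VALUE "red,green;bl".
       01  Part1               PIC X(6).
       01  Part2               PIC X(6).
       01  Delim1              PIC X.
       01  Delim2              PIC X.
       01  Count1              PIC 99.
       01  Count2              PIC 99.
       01  PartsUsed           PIC 99 VALUE ZERO.
      * PIC 99 can hold 13, one more than the size of SourceStr.
       01  UnstrPtr            PIC 99 VALUE 1.
       PROCEDURE DIVISION.
       Begin.
           UNSTRING SourceStr DELIMITED BY "," OR ";"
               INTO Part1 DELIMITER IN Delim1 COUNT IN Count1
                    Part2 DELIMITER IN Delim2 COUNT IN Count2
               WITH POINTER UnstrPtr
               TALLYING IN PartsUsed
               ON OVERFLOW DISPLAY "Unexamined characters remain"
           END-UNSTRING
           DISPLAY "[" Part1 "] " Delim1 " " Count1
           DISPLAY "[" Part2 "] " Delim2 " " Count2
           DISPLAY "Parts " PartsUsed " Pointer " UnstrPtr
           STOP RUN.
```

Both destinations are filled ("red" ended by the comma, "green" ended by the semicolon) while "bl" is still unexamined, so the ON OVERFLOW statement runs. PartsUsed should be 02 and UnstrPtr should be 11, the position of the "b".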


UNSTRING examples

Introduction This section starts with 6 short examples exploring various aspects of the
UNSTRING.

Example 7 is a longer and more practical example.

Finally the section ends with an example program that uses the UNSTRING.

Examples 1 - 6 (animation) When you view these examples start by carefully examining
the UNSTRING statement. See if you can figure out what it's going to do (you may
need to review the material above for this).

When you are happy with your prediction step through the example and see if
you were correct.

If your prediction is not correct try to discover what misunderstanding has led to
your error.


Example 7 (animation) The WITH POINTER phrase is often very useful because it means
that we don't have to unpack the whole source string in one go.

In this example, we unpack a full name, any number of Christian names followed
by a surname, and display the names one after another.
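One way to write such a loop is sketched below. It is not the code from the animation; the program name, data names and the name string are assumptions made for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UnstrLoop.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * All names and values here are invented for illustration.
       01  FullName            PIC X(30)
                               VALUE "Michael John Tim Ryan".
       01  OneName             PIC X(15).
      * PIC 99 can hold 31, one more than the size of FullName.
       01  UnstrPtr            PIC 99 VALUE 1.
           88  NameProcessed   VALUE 31.
       PROCEDURE DIVISION.
       Begin.
           PERFORM UNTIL NameProcessed
      *        Each pass takes one name and leaves UnstrPtr at
      *        the start of the next one.
               UNSTRING FullName DELIMITED BY ALL SPACES
                   INTO OneName
                   WITH POINTER UnstrPtr
               END-UNSTRING
               DISPLAY OneName
           END-PERFORM
           STOP RUN.
```

Because the delimiter is ALL SPACES, the trailing spaces after the surname are consumed in one step and the pointer ends at 31, one past the end of FullName, which is the value the NameProcessed condition tests for.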


Example program Comma delimited or comma separated values (CSV) is a file format widely used
in the computing industry.

It allows data to be transferred between applications with incompatible file
formats (Excel, for example, can save its spreadsheets in this form).

But before the records in a CSV format file can be processed they must be
unpacked into individual fields.

In this example, a file of customer records is held in a file with a CSV-like format.
Each record contains a customer name, a customer address and the customer
balance. The fields are separated from one another by commas. The individual
parts of the customer address are separated by the slash "/" character.

An example record is -

Michael Ryan,3 Winchester Drive/Castletroy/Limerick/Ireland,0022456

The program below reads the file, unpacks each record into separate fields
and writes the unpacked record to a new file.

IDENTIFICATION DIVISION.
PROGRAM-ID. CDFILE.
AUTHOR. Michael Coughlan.

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT CommaDelimitedFile ASSIGN TO "CDFILE.DAT".
    SELECT CustomerFile ASSIGN TO "CustFile.DAT".

DATA DIVISION.
FILE SECTION.
FD  CommaDelimitedFile.
01  CommaDelimitedRec       PIC X(205).
    88 EndOfFile            VALUE HIGH-VALUES.

FD  CustomerFile.
01  CustomerRec.
    02 CustName             PIC X(40).
    02 AddrLinesUsed        PIC 9.
    02 CustAddress.
       03 AddrLine          PIC X(25) OCCURS 1 TO 6
                            DEPENDING ON AddrLinesUsed.
    02 CustBalance          PIC 9(5)V99.

WORKING-STORAGE SECTION.
01  TempAddress             PIC X(150).
01  TempBalance             PIC X(7).
01  AdjustedBalance REDEFINES TempBalance PIC 9(5)V99.

PROCEDURE DIVISION.
Begin.
    OPEN INPUT CommaDelimitedFile
    OPEN OUTPUT CustomerFile
    READ CommaDelimitedFile
        AT END SET EndOfFile TO TRUE
    END-READ

    PERFORM UNTIL EndOfFile
        MOVE ZEROS TO AddrLinesUsed
        UNSTRING CommaDelimitedRec DELIMITED BY ","
            INTO CustName, TempAddress, TempBalance
        UNSTRING TempAddress DELIMITED BY "/"
            INTO AddrLine(1), AddrLine(2), AddrLine(3),
                 AddrLine(4), AddrLine(5), AddrLine(6)
            TALLYING IN AddrLinesUsed
        MOVE AdjustedBalance TO CustBalance
        WRITE CustomerRec
        READ CommaDelimitedFile
            AT END SET EndOfFile TO TRUE
        END-READ
    END-PERFORM
    CLOSE CommaDelimitedFile, CustomerFile
    STOP RUN.


Combined example

Introduction This example takes a string, containing a name consisting of any number of
Christian names followed by a surname, and converts it by reducing the Christian
names to their first letters followed by a period.

For example "Michael John Tim James Ryan" becomes "M.J.T.J. Ryan".

Data items used in the example

01 OldName PIC X(80).

01 TempName.
   02 NameInitial PIC X.
   02 FILLER PIC X(15).

01 NewName PIC X(30).

01 Pointers.
   02 StrPtr PIC 99 VALUE 1.
   02 UnstrPtr PIC 99 VALUE 1.
      88 NameProcessed VALUE 81.
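Using the data items above, the conversion could be written along the lines of the sketch below. The Procedure Division is an assumption about how the UNSTRING and STRING might be combined; it is not a transcript of the animation, and the program name is invented.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. InitialsDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  OldName             PIC X(80).
       01  TempName.
           02  NameInitial     PIC X.
           02  FILLER          PIC X(15).
       01  NewName             PIC X(30).
       01  Pointers.
           02  StrPtr          PIC 99 VALUE 1.
           02  UnstrPtr        PIC 99 VALUE 1.
               88  NameProcessed VALUE 81.
       PROCEDURE DIVISION.
       Begin.
           MOVE "Michael John Tim James Ryan" TO OldName
      *    STRING does no space filling, so clear NewName first.
           MOVE SPACES TO NewName
           PERFORM UNTIL NameProcessed
      *        Take the next name; UnstrPtr moves past it and
      *        past the spaces that follow it.
               UNSTRING OldName DELIMITED BY ALL SPACES
                   INTO TempName
                   WITH POINTER UnstrPtr
               END-UNSTRING
               IF NameProcessed
      *            Nothing follows this name: it is the surname.
                   STRING SPACE    DELIMITED BY SIZE
                          TempName DELIMITED BY SPACES
                       INTO NewName WITH POINTER StrPtr
                   END-STRING
               ELSE
                   STRING NameInitial "." DELIMITED BY SIZE
                       INTO NewName WITH POINTER StrPtr
                   END-STRING
               END-IF
           END-PERFORM
           DISPLAY NewName
           STOP RUN.
```

The UNSTRING pointer reaches 81 only after the last name has been taken, so that name is treated as the surname and copied whole; every earlier name contributes just its initial and a period. For the name shown, NewName should end up as "M.J.T.J. Ryan".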

Reference Modification and Intrinsic Functions

Introduction

Aims The aim of this unit is to provide you with a solid understanding of
string manipulation using Reference Modification. It will also introduce you to
Intrinsic Functions and will show how strings may be manipulated using the
String Intrinsic Functions.

Objectives By the end of this unit you should:

1. Understand how Reference Modification works.

2. Be able to use Reference Modification to extract and insert sub-strings.

3. Understand how Intrinsic Functions work.

4. Be able to use Intrinsic Functions for string and date manipulation.


The COBOL Report Writer

Introduction

Aims The aim of this unit is to provide you with a solid understanding of COBOL
Report Writer. This unit introduces the report writer by -

1. Looking at a fairly complex report created using the report writer and
examining what the Report Writer had to do to produce the report.

2. Looking at the Procedure Division of the program that produced the report.

3. Writing a simple version of the program that produced the report.

4. Adding more features to the report program.

5. Examining the use of Declaratives in extending the functionality of the
Report Writer.

6. Examining a full report program that uses Declaratives.

In the next unit we examine the syntax and semantics of the Report Writer clauses
and verbs in detail.
Introduction

Aims The unit provides a Report Writer reference. The syntax elements of the Report
Writer are presented and the semantics of their operation discussed.

Objectives By the end of this unit you should -

1. Know how to set up a report file in the Environment Division and the File
Section.

2. Be able to define a report and the appropriate report groups in the Report
Section.

3. Understand how Declaratives may be used to extend the functionality of
the Report Writer.

4. Be able to write the Procedure Division code (using the INITIATE,
GENERATE and TERMINATE verbs) to create a report.

Prerequisites You should be familiar with the material covered in the units;

Sequential files

Data Declaration

The Report Writer


Defining the Report - Report File entries

Environment Division Just like ordinary reports, the reports generated by the Report Writer are written to
entries an external device - usually a report file.

The Environment Division entries for a report file are the same as those for an
ordinary file. The same Select and Assign clauses apply.

For instance, in the program fragment below, the Select and Assign clauses for the final
example program (given in the previous Report Writer unit) are shown.

When the Report Writer generates this report it will be stored in the file
SALESREPORT.LPT.

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT SalesFile ASSIGN TO "GBPAY.DAT"
        ORGANIZATION IS LINE SEQUENTIAL.
    SELECT PrintFile ASSIGN TO "SALESREPORT.LPT".

DATA DIVISION.
FILE SECTION.
FD  SalesFile.
01  SalesRecord.
    88 EndOfFile       VALUE HIGH-VALUES.
    02 CityCode        PIC 9.
    02 SalesPersonNum  PIC 9.
    02 ValueOfSale     PIC 9(4)V99.

FD  PrintFile
    REPORT IS SalesReport.

: intervening entries :
: intervening entries :

REPORT SECTION.
RD  SalesReport
    CONTROLS ARE FINAL
                 CityCode
                 SalesPersonNum
    PAGE LIMIT IS 66
    HEADING 1
    FIRST DETAIL 6
    LAST DETAIL 42
    FOOTING 52.

File Section entries Although the entries in the Environment Division are the same as those for
ordinary print files, in the File Section the normal file description is replaced by
a phrase which points to the Report Description (RD) in the Report Section. The
syntax of the phrase is -

REPORT IS ReportName

The ReportName must be the same as the ReportName used in the Report
Description (RD) entry.

You can see this in the program fragment above where the REPORT IS
SalesReport phrase links the PrintFile with the Report Description(RD) in the
Report Section.

Notes:

Before the report can be used it must be opened for output. For instance in the
example above the PrintFile must be opened for output before the SalesReport can
be generated.
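The sketch below shows how these entries might fit together in a complete, if very small, program. It is not the tutorial's example program: the program name, report name, file name and sales values are invented, the sales are taken from a table in Working-Storage rather than read from a file, and a compiler with Report Writer support is assumed.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MiniRpt.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT PrintFile ASSIGN TO "MINIREPORT.LPT".
       DATA DIVISION.
       FILE SECTION.
       FD  PrintFile
           REPORT IS MiniReport.
       WORKING-STORAGE SECTION.
      * Four invented sales: a city code followed by a value
      * with two decimal places.
       01  SalesData.
           02  FILLER          PIC X(7) VALUE "1010050".
           02  FILLER          PIC X(7) VALUE "1002575".
           02  FILLER          PIC X(7) VALUE "2120000".
           02  FILLER          PIC X(7) VALUE "2000999".
       01  FILLER REDEFINES SalesData.
           02  Sale OCCURS 4 TIMES.
               03  SaleCity    PIC 9.
               03  SaleValue   PIC 9(4)V99.
       01  CityCode            PIC 9.
       01  ValueOfSale         PIC 9(4)V99.
       01  SaleIdx             PIC 9.
       REPORT SECTION.
       RD  MiniReport
           CONTROLS ARE FINAL CityCode
           PAGE LIMIT IS 66
           HEADING 1
           FIRST DETAIL 3
           LAST DETAIL 60
           FOOTING 62.
       01  TYPE IS PAGE HEADING.
           02  LINE 1.
               03  COLUMN 1    PIC X(12) VALUE "Sales report".
       01  DetailLine TYPE IS DETAIL.
           02  LINE IS PLUS 1.
               03  COLUMN 1    PIC 9 SOURCE CityCode GROUP INDICATE.
               03  COLUMN 10   PIC Z,ZZ9.99 SOURCE ValueOfSale.
       01  TYPE IS CONTROL FOOTING CityCode.
           02  LINE IS PLUS 1.
               03  COLUMN 1    PIC X(4) VALUE "City".
               03  CS COLUMN 9 PIC ZZ,ZZ9.99 SUM ValueOfSale.
       01  TYPE IS CONTROL FOOTING FINAL.
           02  LINE IS PLUS 2.
               03  COLUMN 1    PIC X(5) VALUE "Total".
               03  COLUMN 9    PIC ZZ,ZZ9.99 SUM CS.
       PROCEDURE DIVISION.
       Begin.
      *    The report file is opened before the report is used.
           OPEN OUTPUT PrintFile
           INITIATE MiniReport
           PERFORM VARYING SaleIdx FROM 1 BY 1 UNTIL SaleIdx > 4
               MOVE SaleCity(SaleIdx) TO CityCode
               MOVE SaleValue(SaleIdx) TO ValueOfSale
               GENERATE DetailLine
           END-PERFORM
           TERMINATE MiniReport
           CLOSE PrintFile
           STOP RUN.
```

The FD carries only the REPORT IS phrase, the RD of the same name describes the page and the control items, and the Procedure Division does nothing more than open the file, INITIATE the report, GENERATE one detail line per sale and TERMINATE the report.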

Defining the Report - Report Description (RD) Entries

Introduction Every report generated by the Report Writer must have a Report Description (RD)
entry in the Report Section.

The RD entry names the report and specifies the format of the printed page and
identifies the control break items.

Each RD entry is followed by one or more 01 level-number entries. Each 01
level-number entry identifies a report group and consists of a hierarchical
structure similar to a COBOL record.

Each report group is a unit consisting of one or more print lines and cannot be
split across pages.

The Report Description (RD) entries - Syntax

The syntax for the Report Description (RD) entries is shown below -

The Report Description (RD) entries - Semantics

RD Rules

1. The ReportName can only appear in one RD entry.

2. When more than one report is declared in the Report Section the
ReportName can be used to qualify the LINE-COUNTER and PAGE-
COUNTER report registers.

CONTROL Rules

1. The ControlName$#i must not be defined in the Report Section.

2. Each occurrence of ControlName$#i must identify a different data item.

3. The ControlName$#i must not have a variable length table subordinate to
it.

4. ControlName$#i and FINAL specify the levels of the control break
hierarchy where FINAL (if specified) is the highest and the first
ControlName$#i the next highest and so on.

5. When the value in any ControlName$#i changes a control break occurs.
The level of the control break depends on the position of the
ControlName$#i in the control break hierarchy.

PAGE Rules

1. HeadingLine#l must be greater than or equal to 1.

2. HeadingLine#l <= FirstDetailLine#l <= LastDetailLine#l <= FootingLine#l <=
PageSize#l. For instance, the RD in the program fragment above satisfies this
rule because 1 <= 6 <= 42 <= 52 <= 66.

3. Line numbers used in a Report Heading or Page Heading group must be
greater than or equal to HeadingLine#l and less than FirstDetailLine#l but
when a Report Heading appears on a page by itself any line number
between HeadingLine#l and PageSize#l may be used.

4. Line numbers used in Detail or Control Heading groups must be in the
range FirstDetailLine#l to LastDetailLine#l inclusive.

5. Line numbers used in Control Footing groups must be in the range
FirstDetailLine#l to FootingLine#l inclusive.

6. Line numbers used in the Report Footing or Page Footing groups must be
greater than FootingLine#l and less than or equal to PageSize#l but when a
Report Footing appears on a page by itself any line number between
HeadingLine#l and PageSize#l may be used.

7. All report groups must be defined so that they can be presented on one
page. The Report Writer never splits a multi-line group across page
boundaries.

Defining the Report - Report Group entries


Introduction In the Report Section each report group is represented by a report group record.
As with all record descriptions in COBOL a report group record starts with level
number 01. Subordinate items in the record describe the report lines and columns
within the report group.

The description of a report group consists of a number of hierarchic levels. At the
top must be the Report Group Definition.

Defining a Report Group Record - Syntax

Report Group Definition

ReportGroupName Rules

The ReportGroupName is required only when the group -

1. is a detail group referenced by a GENERATE statement or the UPON phrase
of a SUM clause.

2. is referenced in a USE BEFORE REPORTING sentence in the Declaratives.

3. is required to qualify the reference to a sum counter.

The LINE NUMBER clause

The LINE NUMBER clause is used to specify the vertical positioning of print lines.
Lines can be printed on -

o a specified line (absolute)

o a specified line on the next page (absolute)

o the current line number plus some increment (relative)

Rules:

1. The LINE NUMBER clause specifies where each line is to be printed so no
item that contains a LINE NUMBER clause may contain a subordinate item
that also contains a LINE NUMBER clause (subordinate items specify the
column items).

2. Where absolute LINE NUMBER clauses are specified all absolute clauses
must precede all relative clauses and the line numbers specified in the
successive absolute clauses must be in ascending order.

3. The first LINE NUMBER clause specified in a PAGE FOOTING group must be
absolute.

4. The NEXT PAGE clause can only appear once in a given report group
description and it must be in the first LINE NUMBER clause in the report
group.

5. The NEXT PAGE clause cannot appear in any HEADING group.

The NEXT GROUP clause

The NEXT GROUP clause is used to specify the vertical positioning of the start of
the next group. The NEXT GROUP clause can be used to specify that the next report
group should be printed on -

o a specified line (absolute)

o the current line number plus some increment (relative)

o the next page

Rules

The NEXT PAGE option in the NEXT GROUP clause must not be specified in a page
footing.

The NEXT GROUP clause must not be specified in a REPORT FOOTING or PAGE
HEADING group.

When used in a DETAIL group the NEXT GROUP clause refers to the next DETAIL
group to be printed.

The TYPE clause The TYPE clause specifies the type of the report group. The type of the report
group governs when and where it will be printed in the report (for instance, a
REPORT HEADING group is printed only once - at the beginning of the report).

Rules

Most groups are defined once for each report but control groups (other than
CONTROL ..FINAL groups) are defined for each control break item.

In REPORT FOOTING, and CONTROL FOOTING groups, SOURCE and USE clauses
must not reference any data item which contains a control break item or is
subordinate to a control break item.

PAGE HEADING or FOOTING groups must not reference a control break item or any
item subordinate to a control break item.

DETAIL report groups are processed when they are referenced in a GENERATE
statement. All other groups are processed automatically by the Report Writer.
There can be more than one DETAIL group.

The REPORT HEADING, PAGE HEADING, CONTROL HEADING FINAL, CONTROL
FOOTING FINAL, PAGE FOOTING and REPORT FOOTING report groups can each
appear only once in the description of a report.

Defining the Report - Lines and Columns

Introduction The subordinate items in a report group record describe the report lines and
columns within the report group. There are two formats for defining items
subordinate to the report group record. The first is usually used to define the lines
of the report group and the second to define and position the elementary print
items.

Defining Report Lines

Print lines in a report group are usually defined using the format shown below.
This format is used to specify the vertical placement of a print line and it is
always followed by subordinate items that specify the columns where the data
items are to be printed.

As shown in the syntax diagram below the level number is from 2 to 48 inclusive.

If the ReportLineName is used its only purpose is to qualify a sum counter
reference.

Defining Elementary print items in a report group

Elementary print items in the print line of a report group are described using the
syntactic elements shown in the diagram below.

As we can see from the syntax diagram, the normal data description clauses such
as PIC, USAGE, SIGN, JUSTIFIED, BLANK WHEN ZERO and VALUE may be applied
when describing an elementary print item. The Report Writer provides a number
of additional clauses that may also be used.

The PrintItemName can only be referenced if the entry uses the SUM clause to
define a sum counter.

The COLUMN NUMBER clause

The COLUMN NUMBER specifies the position of a print item on the print line.
When this clause is used it must be subordinate to an item that contains a LINE
NUMBER clause.

Within a given print line the ColNum#l's should be in ascending sequence.

ColNum#l specifies the column number of the leftmost character position of the
print item.

The GROUP INDICATE clause

The GROUP INDICATE is used to specify that a print item should be printed only on
the first occurrence of its report group after a control break or page advance.

For instance in the full version of the example program the CityName and
SalesPersonNum are suppressed after their first occurrence.

01 DetailLine TYPE IS DETAIL.
   02 LINE IS PLUS 1.
      03 COLUMN 1 PIC X(9)
         SOURCE CityName(CityCode) GROUP INDICATE.
      03 COLUMN 15 PIC 9
         SOURCE SalesPersonNum GROUP INDICATE.
      03 COLUMN 25 PIC $$,$$$.99 SOURCE ValueOfSale.

The GROUP INDICATE clause can only appear in a DETAIL report group.

The SOURCE clause The SOURCE clause is used to identify a data item that contains the value to be
used when the print-item is printed. For instance, the SOURCE ValueOfSale
clause in the example above specifies that the value of the item to be printed in
column 25 is to be found in the data-item ValueOfSale.

The SUM clause The SUM clause is used both to establish a sum counter and to name the data-
items to be summed.

Three forms of summing can be done in the Report Writer;

1. Subtotalling

2. Rolling Forward

3. Cross footing

Subtotalling

In Subtotalling, each time a GENERATE statement is executed the value to be
summed is added to the sum counter.

Rolling Forward
In Rolling Forward the value accumulated in the sum counter of one group is used
as the value to be summed in the sum counter of another group.

The full version of the example program contains a good example of both
Subtotalling and Rolling forward. The SUM clause is used to accumulate the
ValueOfSale into a sum counter named as SMS (Subtotalling). The SMS sum
counter is used as the value to be summed into the CS sum counter and CS is used
as the value to be summed into the final total (Rolling Forward).

In the example code fragment below, each time a DETAIL line is GENERATED the
ValueOfSale is added to the SMS sum counter. When a control break occurs on
SalesPersonNum the accumulated sum is printed and is added to the CS sum
counter. When a control break occurs on the CityCode the sum accumulated in the
CS sum counter is added to the final total sum counter.

    01  SalesPersonGrp
            TYPE IS CONTROL FOOTING SalesPersonNum NEXT GROUP PLUS 2.
            : other entries :
            : other entries :
            03  SMS COLUMN 45 PIC $$$$$,$$$.99 SUM ValueOfSale.

    01  CityGrp TYPE IS CONTROL FOOTING CityCode NEXT GROUP PLUS 2.
            : other entries :
            : other entries :
            03  CS COLUMN 45 PIC $$$$$,$$$.99 SUM SMS.

    01  TotalSalesGrp TYPE IS CONTROL FOOTING FINAL.
            : other entries :
            : other entries :
            03  COLUMN 45 PIC $$$$$,$$$.99 SUM CS.

Cross Footing

In Cross Footing, sum counters in the same report group can be added together. In
the example below, each time a GENERATE statement is executed the value of
SalesPersonSale is added to the SPS sum counter and the value of
CounterSale is added to the SCS sum counter (Subtotalling). When a control break
occurs on ShopNum the values of the SPS and SCS sum counters are added together
to give the combined total printed in column 60 (Cross Footing).

    01  ShopTotalsGrp
            TYPE IS CONTROL FOOTING ShopNum NEXT GROUP PLUS 2.
            : other entries :
            : other entries :
            03  SPS COLUMN 20 PIC $$$,$$$.99 SUM SalesPersonSale.
            03  SCS COLUMN 40 PIC $$$,$$$.99 SUM CounterSale.
            03  COLUMN 60 PIC $$$$,$$$.99 SUM SPS, SCS.

Rules

1. If the SUM..UPON ReportGroupName clause is used then the value is
   only added to the sum counter when the specified report group is printed.

2. A SUM clause can appear only in the description of a CONTROL FOOTING
   report group.

3. Statements in the Procedure Division can be used to alter the contents of
   the sum counters.
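
To illustrate rule 1, here is a minimal sketch that assumes the DetailLine group and the ValueOfSale item from the earlier fragments; the LINE entry and the column position are arbitrary choices made for illustration. With the UPON phrase, ValueOfSale is added to SMS only when the DetailLine group itself is generated, which matters when a report has more than one DETAIL group.

```cobol
      * Sketch only: assumes DetailLine is a DETAIL group in this report.
       01  SalesPersonGrp
               TYPE IS CONTROL FOOTING SalesPersonNum NEXT GROUP PLUS 2.
           02  LINE IS PLUS 1.
               03  SMS COLUMN 45 PIC $$$$$,$$$.99
                       SUM ValueOfSale UPON DetailLine.
```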

The RESET ON clause

Sum counters are normally reset to zero after a control break on the control break
item associated with the report group but the RESET clause can be used to reset the
sum counters on the break of a more senior control item.

Rules

1. The ControlBreakItem#$i must be one of the data items mentioned in the
   CONTROL/CONTROLS clause in the Report Description.
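
As an illustration, the following sketch assumes the SalesPersonGrp group from the earlier fragment and a Report Description whose CONTROLS clause names both CityCode and SalesPersonNum, with CityCode the more senior item. The LINE entry is an assumption. Here the SMS counter is not cleared at each SalesPersonNum break; it keeps accumulating and is reset only when CityCode changes, so each footing shows a running total for the city.

```cobol
      * Sketch only: CityCode is assumed to be senior to SalesPersonNum
      * in the CONTROLS clause of the RD entry.
       01  SalesPersonGrp
               TYPE IS CONTROL FOOTING SalesPersonNum NEXT GROUP PLUS 2.
           02  LINE IS PLUS 1.
               03  SMS COLUMN 45 PIC $$$$$,$$$.99
                       SUM ValueOfSale RESET ON CityCode.
```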

Special Report Writer registers

Introduction The Report Writer maintains two special registers for each report declared in the
Report Section. The two registers are -

the LINE-COUNTER
and
the PAGE-COUNTER

The LINE-COUNTER register

The reserved word LINE-COUNTER can be used to access a special register that the
Report Writer maintains for each report in the Report Section.

The Report Writer uses the LINE-COUNTER register to keep track of where the
lines are being printed on the report. It uses this information and the information
specified in the PAGE LIMIT clause in the RD entry to decide when a new page is
required.

While the LINE-COUNTER register can be used as a SOURCE item in the report, no
statements in the Procedure Division can alter the value in the register.

References to the LINE-COUNTER register can be qualified by referring to the
name of the report given in the RD entry.
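
A qualified reference might look like the sketch below. It assumes the report is named SalesReport in its RD entry, as in the example program; the DISPLAY statement is only there to show the register being read, not changed.

```cobol
      * Sketch only: reads (never alters) the register for SalesReport.
           DISPLAY "Current line = " LINE-COUNTER OF SalesReport
```

Qualification is mainly useful when the Report Section declares more than one report, because each report has its own LINE-COUNTER.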

The PAGE-COUNTER register

The reserved word PAGE-COUNTER can be used to access a special register that the
Report Writer maintains for each report in the Report Section.

The PAGE-COUNTER is used to count the number of pages in the report.

The PAGE-COUNTER register can be used as a SOURCE item in the report but the
value of the PAGE-COUNTER may also be changed by statements in the Procedure
Division.

    01  TYPE IS PAGE FOOTING.
        02  LINE IS 53.
            03  COLUMN 1  PIC X(29)
                          VALUE "Programmer - Michael Coughlan".
            03  COLUMN 45 PIC X(6) VALUE "Page :".
            03  COLUMN 52 PIC Z9 SOURCE PAGE-COUNTER.
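
Changing the PAGE-COUNTER from the Procedure Division could be sketched as follows, assuming the report is named SalesReport and that, purely for illustration, page numbering should start at 5. Because INITIATE sets the PAGE-COUNTER to one, the MOVE has to come after the INITIATE statement.

```cobol
      * Sketch only: the starting page number 5 is an arbitrary example.
           INITIATE SalesReport
           MOVE 5 TO PAGE-COUNTER OF SalesReport
```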

Procedure Division verbs

Introduction The Report Writer introduces four new verbs for processing reports. These are -

1. INITIATE

2. GENERATE

3. TERMINATE

4. SUPPRESS PRINTING

The first three are normal Procedure Division verbs but the last can only be used
in the Declaratives Section.

The normal Procedure Division verbs will be covered in this section. Please see
the section on Declaratives for the SUPPRESS PRINTING statement.

Syntax

INITIATE The INITIATE statement starts the processing of the ReportName report or reports.

The ReportName must be the name defined for a report in the RD entry in the
Report Section.

The INITIATE statement initializes -

all the report's sum counters to zero

the report LINE-COUNTER to zero

the report PAGE-COUNTER to one

Before the INITIATE statement is executed the file associated with the Report must
have been opened for OUTPUT or EXTEND.

GENERATE The GENERATE statement drives the production of the report.

The target of a GENERATE statement is either a DETAIL report group or a
ReportName.

When the target is a ReportName then the report description must contain -

a CONTROL clause

not more than one DETAIL group

at least one group that is not a PAGE or REPORT group.

When all the GENERATE statements for a particular report target the ReportName
then the report performs summary processing only and the report produced is
called a summary report. For instance, to make a summary report from the
example report all we have to do is change the GENERATE statement from -

GENERATE DetailLine.

to

GENERATE SalesReport.

If we specify GENERATE SalesReport then the DETAIL group is never printed but
the other groups are printed (see the first page of the summary report shown
below).

    An example COBOL Report Program
    Bible Salesperson - Sales and Salary Report

    City Salesperson Sale
    Name Number Value

    Sales for salesperson 1 = $334.00
    Sales commission is = $16.70
    Salesperson salary is = $139.70
    Current salesperson number = 2
    Previous salesperson number = 1

    Sales for salesperson 2 = $3,667.50
    Sales commission is = $183.38
    Salesperson salary is = $504.38
    Current salesperson number = 3
    Previous salesperson number = 2

    Sales for salesperson 3 = $777.70
    Sales commission is = $38.89
    Salesperson salary is = $473.89
    Current salesperson number = 1
    Previous salesperson number = 3

    Sales for Dublin = $4,779.20
    Current city = 2
    Previous city = 1

    Sales for salesperson 1 = $334.00
    Sales commission is = $16.70
    Salesperson salary is = $139.70
    Current salesperson number = 2
    Previous salesperson number = 1

    Sales for salesperson 2 = $2,334.00
    Sales commission is = $116.70
    Salesperson salary is = $458.70
    Current salesperson number = 3
    Previous salesperson number = 2

    Sales for salesperson 3 = $222.20
    Sales commission is = $11.11
    Salesperson salary is = $122.11
    Current salesperson number = 4
    Previous salesperson number = 3

    Programmer - Michael Coughlan Page : 1

TERMINATE The TERMINATE statement instructs the Report Writer to complete the processing
of the specified report. The Report Writer prints all the CONTROL FOOTING groups,
as if there had been a control break on the most senior control group, and prints
the Page and Report Footings.

After the report has been terminated the file associated with the report must be
closed. For instance in the example program the TERMINATE SalesReport
statement is followed by the CLOSE PrintFile statement.
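
The three verbs are typically used together in a read loop. The sketch below assumes the SalesFile, PrintFile, SalesReport and DetailLine names from the example program; the EndOfFile condition name is hypothetical and stands for whatever end-of-file flag the program declares in its Data Division.

```cobol
      * Sketch only: EndOfFile is a hypothetical level 88 condition name.
       PROCEDURE DIVISION.
       Begin.
           OPEN INPUT SalesFile
           OPEN OUTPUT PrintFile
           INITIATE SalesReport
           READ SalesFile
               AT END SET EndOfFile TO TRUE
           END-READ
           PERFORM UNTIL EndOfFile
               GENERATE DetailLine
               READ SalesFile
                   AT END SET EndOfFile TO TRUE
               END-READ
           END-PERFORM
           TERMINATE SalesReport
           CLOSE SalesFile, PrintFile
           STOP RUN.
```

Note the ordering: the print file is opened before INITIATE and closed only after TERMINATE, as the text above requires.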

Report Writer Declaratives

Introduction Declaratives are used to extend the functionality of the Report Writer when its
standard operations are insufficient to create the desired report. Declaratives
extend the functionality of the Report Writer by performing tasks or calculations
that the Report Writer cannot do automatically or by selectively stopping a
report group from being printed (using the SUPPRESS PRINTING command).

When Declaratives are used with the Report Writer the USE BEFORE REPORTING
phrase allows a programmer to specify that the particular section of code is to be
executed before some report group is printed.

Declaratives may also be used to handle file operation errors. In this case the USE
clause should use the following syntax -
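
One commonly used form of the file-error USE statement is sketched below. It assumes the SalesFile from the example program; SalesFileStatus is a hypothetical data item that the program would have to name in a FILE STATUS clause for SalesFile, and the section and paragraph names are likewise illustrative.

```cobol
      * Sketch only: SalesFileStatus is a hypothetical FILE STATUS item.
       PROCEDURE DIVISION.
       DECLARATIVES.
       SalesFileError SECTION.
           USE AFTER STANDARD ERROR PROCEDURE ON SalesFile.
       ShowFileStatus.
           DISPLAY "I/O error on SalesFile, status = " SalesFileStatus.
       END DECLARATIVES.
```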

Using Declaratives with the Report Writer

When Declaratives are used they must appear as the first item in the Procedure
Division.

The Declaratives must be divided into sections and each section must be
associated with a particular USE statement.
The USE statement governs when a particular declarative section is executed.

When Declaratives are used with the Report Writer the USE syntax shown below
must be used -

Rules:

The ReportGroupName must not appear in more than one USE statement.

The GENERATE, INITIATE and TERMINATE statements must not appear in the
Declaratives.

The value of any control data items must not be altered in the Declaratives.

Statements in the Declaratives must not reference non-declarative procedures.

Declaratives Example

    PROCEDURE DIVISION.
    DECLARATIVES.
    Calc SECTION.
        USE BEFORE REPORTING SalesPersonGrp.
    Calculate-Salary.
        MULTIPLY SMS BY Percentage
            GIVING Commission ROUNDED.
        ADD Commission, FixedRate(CityCode,SalesPersonNum)
            GIVING Salary.
    END DECLARATIVES.

    Main SECTION.
    Begin.
        OPEN INPUT SalesFile.
        OPEN OUTPUT PrintFile.
        : etc :
        : etc :

The SUPPRESS PRINTING statement

The SUPPRESS PRINTING statement is used in a Declarative section to stop a
particular report group from being printed.

The report group suppressed is the one mentioned in the USE statement
associated with the section containing the SUPPRESS PRINTING statement.

The SUPPRESS PRINTING statement must be executed each time you want to stop
the report group from being printed.

In the example below we only want to print the ShopTotalGrp if the sales for the
shop are greater than or equal to 10,000.

    PROCEDURE DIVISION.
    DECLARATIVES.
    CheckShopSales SECTION.
        USE BEFORE REPORTING ShopTotalGrp.
    PrintMajorShopsOnly.
        IF SalesTotal < 10000
            SUPPRESS PRINTING
        END-IF.
    END DECLARATIVES.

Control Break Registers

A control break occurs when the value in a control break item changes. This
causes an interesting problem when the control break item is used as the SOURCE
value in a control footing because by the time the control footing is printed, the
control break item no longer contains the correct value. When we print a control
footing and refer to a control break item we normally want to access the previous
value of the item, not the current one.

The Report Writer deals with this problem by maintaining a special control break
register for each control break item mentioned in the CONTROLS ARE phrase in the
RD entry.

When a control break item is referred to in a control footing or in the Declaratives
the Report Writer supplies the value held in the control break register and not the
value in the item itself.

If a control break has just occurred the value in the control break register is the
previous value of the control break item.
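
As a sketch of what this means in practice, the control footing below uses SalesPersonNum as a SOURCE item. It assumes the SalesPersonGrp, SMS and ValueOfSale names from the earlier fragments; the LINE entry, the literal text and the column positions are assumptions made for illustration. When the footing is printed after a break, the number shown is the one belonging to the salesperson just finished, because the Report Writer supplies the value from the control break register.

```cobol
      * Sketch only: column positions and literals are illustrative.
       01  SalesPersonGrp
               TYPE IS CONTROL FOOTING SalesPersonNum NEXT GROUP PLUS 2.
           02  LINE IS PLUS 1.
               03  COLUMN 15 PIC X(21) VALUE "Sales for salesperson".
               03  COLUMN 37 PIC 9 SOURCE SalesPersonNum.
               03  COLUMN 43 PIC X VALUE "=".
               03  SMS COLUMN 45 PIC $$$$$,$$$.99 SUM ValueOfSale.
```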
