Data Capture and Data Entry


Introduction
These days the majority of computer end-users input data to the computer via keyboards on PCs,
workstations or terminals. However, for many medium- and large-scale commercial and industrial
applications involving large volumes of data, the use of keyboards is not practical or economical.
Instead, specialist methods, devices and media are used.

The selection of the best method of data entry is often the biggest single problem faced by those
designing commercial or industrial computer systems, because of the high costs involved and
numerous practical considerations.

The best methods of data entry may still not give satisfactory facilities if the necessary controls
over their use are not in place.

Problems of data entry


The data to be processed by the computer must be presented in a machine-sensible form (ie, the
language of the particular input device). Therein lies the basic problem, since much data originates
in a form that is far from machine-sensible. Thus a painful, error-prone process of transcription
must be undergone before the data is suitable for input to the computer.

The process of data collection involves getting the original data to the processing center,
transcribing it, sometimes converting it from one medium to another, and finally getting it into the
computer. This process involves a great many people, machines and much expense.

A number of advances have been made in recent years towards automating the data collection
process so as to bypass or reduce the problems. This chapter considers a variety of methods,
including many that are of primary importance in commercial computing.

Data can originate in many forms, but the computer can only accept it in a machine-sensible
form. The process involved in getting the data from its point of origin to the computer in a form
suitable for processing is called Data Collection.

Data collection starts at the source of the raw data and ends when valid data is within the
computer in a form ready for processing.

Many of the problems of data entry can be avoided if the data can be obtained in a
computer-sensible form at the point of origin. This is known as data capture. The capture of data does not
necessarily mean its immediate input to the computer. The captured data may be stored in some
intermediate form for later entry into the main computer in the required form. If data is input
directly into the computer at its point of origin, the data entry is said to be online. In addition,
if the method of direct input is a terminal or workstation, the method of input is known as Direct
Data Entry (DDE). The term Data Entry usually means not only the process of physical input by
a device but also any methods directly associated with the input.

Stages in data collection


The process of data collection may involve any number of the following stages according to the
methods used.

a. If the computer is located at a central point, the documents will be physically
transmitted, i.e., by the post office or a courier, to the central point (e.g., posting
batches of source documents).

b. It is also possible for data to be transmitted by means of telephone lines to the central
computer, in which case no source documents would be involved in the transmission
process (e.g., transmitting data captured at source). A variant on this method is the
use of fax.
c. Data preparation. This is the term given to the transcription of data from the source
document to a machine-sensible medium. There are two parts: the original
transcription itself and the verification process that follows.
Note. Data Capture eliminates the need for transcription.
d. Media conversion. Very often data is prepared in a particular medium and converted to
another medium for faster input to the computer, e.g., data might be prepared on
diskette, or captured onto cassette, and then converted to magnetic tape for input. The
conversion will be done on a computer that is separate from the one for which the data
is intended.
e. Input. The data, now in magnetic form, is subjected to validity checks by a computer
program before being used for processing.
f. Sorting. This stage is required to re-arrange the data into the sequence required for
processing. (This is a practical necessity for efficient processing of sequentially
organized data in many commercial and financial applications.)
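
The input and sorting stages (e and f) can be sketched as follows. This is only a minimal illustration under stated assumptions: the record layout (a six-digit account number and an amount), the particular checks, and the function names are all hypothetical and are not taken from any specific system.

```python
# Hypothetical record layout: (account_number, amount), both as keyed text.
def is_valid(record):
    """Apply simple validity checks: field count, numeric account, positive amount."""
    if len(record) != 2:
        return False
    account, amount = record
    if not (account.isdigit() and len(account) == 6):
        return False
    try:
        return float(amount) > 0
    except ValueError:
        return False


def validate_and_sort(records):
    """Separate accepted from rejected records, then sort accepted ones by account."""
    accepted = [r for r in records if is_valid(r)]
    rejected = [r for r in records if not is_valid(r)]
    accepted.sort(key=lambda r: r[0])  # sequence required for sequential processing
    return accepted, rejected


# "10O233" contains the letter O instead of a zero, a typical keying error.
batch = [("200417", "15.50"), ("10O233", "9.99"), ("100233", "42.00"), ("300001", "-5")]
accepted, rejected = validate_and_sort(batch)
print(accepted)
print(rejected)
```

Rejected records would normally be reported back for correction and re-entry rather than discarded.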

Data-collection media and methods in outline


The alternatives are as follows.
a. Online transmission of data from source, eg, Direct Data Entry (DDE).
b. Source data keyed directly onto diskette (key-to-diskette) from source
documents.
c. The source document itself prepared in machine-sensible form using Character
Recognition techniques (OCR, OMR, MICR).
d. Data Capture Devices.
e. Portable encoding devices.
f. Source data captured from Tags, Plastic Badges or strips (Barcodes).
g. Creation of data for input as a by-product of another operation.

On-line systems
The ultimate in data collection is to have the computer linked directly to the source of the data. If
this is not feasible then the next best thing is to capture the data as near as possible to its source
and feed it to the computer with little delay.

Such methods may involve the use of data transmission equipment if the point of origination is
remote from the computer. The computer is linked to the terminal point (the source of data or
nearby) by a telecommunication line and data is transmitted over the line to the computer system.

Data enters the terminal either by keying in via a keyboard or by a device such as one that can
directly read source documents.

Magnetic tape cassettes are often used for data storage. The cassettes are just like those used for
domestic audio systems.

Data entry to the device is usually by means of a small keyboard, like a calculator keyboard, or by
some special reading attachment.

A basic device, using only a keyboard for data entry, and able to transmit data, is effectively a
portable terminal (eg, a Pocket PC).

Popular attachments to both portable and static devices are the light pen and magnetic pen.
These attachments resemble pens at the end of a length of electrical flex. More bulky hand-held
alternatives are sometimes called wands. They can read specially coded data in the form of
either optical marks/characters, or magnetic codes which have previously been recorded on
strips of suitable material. A common version is the bar-code reader.

The use of tags as a data collection technique is usually associated with clothing retailing
applications, although they are also used to some extent in other applications.

The original tags were miniature punched cards. Today most tags in use have magnetic
strips on them instead of holes.

Using a special code, data such as price of garment, type and size, and branch/department
are recorded on the tag by a machine. Certain of the data is also printed on the tag.

Tags are affixed to the garment before sale and are removed at the point of sale. At the end
of the day's trading each store will send its tags (representing the day's sales) in a
container to the data processing center. Alternatively, the tags may be processed at the
point of sale.

At the center the tags are converted to more conventional diskette or magnetic tape for
input to the computer system.

Note that data is captured at the source (point of sale) in a machine-sensible form and thus
needs no transcription and can be processed straightaway by the machine.

Bar-coded and magnetic strips


Data can be recorded on small strips, which are read optically or magnetically. Optical reading is
done by using printed bar codes, ie, alternating lines and spaces that represent data in binary.
Magnetic reading depends on a strip of magnetic tape on which data has been encoded. The data
are read by a light-pen, magnetic-pen or wand which is passed over the strip. Portable devices are
available that also include a keyboard. An example of their use is in stock recording; the light pen
is used to read the stock code from a strip attached to the shelf, and the quantity is keyed
manually. The data are recorded on a magnetic tape cassette. This technique is also used at
checkout points in supermarkets. Goods have strips attached and stock code and price are read by
the light pen. The data thus collected are used to prepare a receipt automatically, and are also
recorded for stock control purposes.

By-product
Online methods prevent the need for physical transportation of source documents to the
processing point. There is also less delay in producing processed information, especially if the
data link provides for two-way transmission of data (ie, from terminal to computer and
computer to terminal).

Such systems can involve large capital outlay on the necessary equipment, which is usually
justified in terms of speed of access to the computer's data and quicker feedback of information.

Online systems are the only practical choice for some applications. One example is the
computer that controls a machine or factory process. It must receive input directly from source in
order to be able to respond at a moment's notice.

Application. One major application of MICR is in banking (look at a cheque book), although some local
authorities use it for payment of rates by installments. Cheques are encoded at the bottom with
account number, branch code and cheque number before being given to the customer. When the cheques
are received from the customers, the bottom line is completed by encoding the amount of the
cheque (ie, post-encoded). Thus all the details necessary for processing are now encoded in magnetic ink characters
and the cheque enters the computer system via a magnetic ink character reader to be processed.
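
The two encoding steps for a cheque can be pictured with a small sketch. It assumes a hypothetical `Cheque` record in which the branch code, account number and cheque number are present from the start and the amount is added later. The field names and the use of pence are assumptions made for illustration, not a description of any bank's actual format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cheque:
    """Hypothetical cheque record; the first three fields are pre-encoded."""
    branch_code: str
    account_number: str
    cheque_number: str
    amount_pence: Optional[int] = None  # empty until the cheque is post-encoded

    def post_encode(self, amount_pence):
        """Add the amount once the completed cheque has been received."""
        if amount_pence <= 0:
            raise ValueError("amount must be positive")
        self.amount_pence = amount_pence

    def ready_for_reader(self):
        """All details needed for processing are present only after post-encoding."""
        return self.amount_pence is not None


cheque = Cheque(branch_code="40-12-34", account_number="00123456", cheque_number="000101")
print(cheque.ready_for_reader())
cheque.post_encode(2550)
print(cheque.ready_for_reader())
```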

These devices are mostly special-purpose devices intended for use in particular applications.
Common, special and typical examples are described in the next few paragraphs.

Direct Input Devices.


a. Special sensing devices may be able to detect events as they happen and pass the appropriate
data directly to the computer. For example:
i. On an automated production line, products or components can be counted as
they pass specific points. Errors can stop the production line.
ii. At a supermarket checkout a laser scanner may read coded marks on food packets
as the packets pass by on the conveyor. This data is used by the computerized till
to generate a till receipt and maintain records of stock levels.
iii. In a computer-controlled chemical works, food factory or brewery, industrial
instruments connected to the computer can read temperatures and pressures in
vats.

Voice data entry (VDE) devices. Data can be spoken into these devices. Currently they are
limited to a few applications in which a small vocabulary is involved.

Features:
The specific features of these devices tend to depend upon the application for which they are
used. However, the data captured by the device must ultimately be represented in some binary
form in order to be processed by a digital computer. For some devices, the input may merely be a
single-bit representation that corresponds to some instrument, such as a pressure switch, being on
or off.
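
As a rough illustration of single-bit input, the sketch below packs the on/off states of several switches into one integer, one bit per switch. The function names and the choice of bit order are assumptions made for the example.

```python
def pack_switches(states):
    """Pack on/off readings into one integer, one bit per switch.

    states[0] is stored in the least significant bit.
    """
    value = 0
    for position, is_on in enumerate(states):
        if is_on:
            value |= 1 << position
    return value


def switch_is_on(value, position):
    """Return True if the bit for the given switch position is set."""
    return ((value >> position) & 1) == 1


# Four hypothetical pressure switches: on, off, on, on.
reading = pack_switches([True, False, True, True])
print(format(reading, "04b"))  # prints 1101
```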

Data loggers/recorders. These devices record and store data at source. The data is input to the
computer at some later stage.

Features:
a. The device usually contains its own microprocessor and data storage device/medium or radio
transmitter.
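
The idea of recording at source and inputting later can be sketched in a few lines. The `DataLogger` class below is purely hypothetical: it keeps readings in its own store and hands them over in one batch when unloaded, standing in for whatever storage medium or transmitter a real device would use.

```python
class DataLogger:
    """Hypothetical logger that stores readings at source for later input."""

    def __init__(self):
        self._store = []  # stands in for the device's own storage medium

    def record(self, timestamp, value):
        """Capture one reading at the point of origin."""
        self._store.append((timestamp, value))

    def unload(self):
        """Hand over all stored readings for input to the computer and clear the store."""
        readings, self._store = self._store, []
        return readings


logger = DataLogger()
logger.record("09:00", 18.5)
logger.record("09:15", 19.1)
transferred = logger.unload()
print(transferred)
```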

The IBM 3661 Supermarket System incorporates a high-speed optical scanner. As an item is
pulled across the scanner's window a laser beam reads the European Article Number, EAN (or
Universal Product Code in the US), bar code printed on the side of the package, and the system
automatically decodes and registers the information on the symbol. The item can be of any shape
and size and the bar code can be passed over the window in any direction.
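
The text does not say how a decoded number is checked. As a sketch, the standard EAN-13 rule weights the first twelve digits alternately by 1 and 3 and uses the thirteenth as a check digit, which lets a scanner reject a misread. The function names below are chosen for illustration only.

```python
def ean13_check_digit(first12):
    """Compute the EAN-13 check digit from the first twelve digits."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_ean13(code):
    """Return True if the 13-digit code ends with the correct check digit."""
    return (
        len(code) == 13
        and code.isdigit()
        and ean13_check_digit(code[:12]) == int(code[12])
    )


print(ean13_check_digit("400638133393"))  # prints 1
print(is_valid_ean13("4006381333931"))    # prints True
```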

Cash registers. These are fitted with magnetic tape cassette units. A mass of statistical data is
captured at source without any intermediate operation. The cassettes, etc, are forwarded to the
data processing center for input to a computer. Alternatively, the cash register may be connected
online.

Point-of-sale terminals
The Point-of-Sale (POS) Terminal is essentially an electronic cash register that is linked to a
computer, or that records data onto cassette or cartridge. In its simplest form, it may simply
transmit the details of a transaction to the computer for processing. The more complex terminals
can communicate with the computer for such purposes as checking the credit position of a
customer, obtaining prices from file and ascertaining availability of stock. If the customer's bank
or credit account is debited, this is EFTPOS (Electronic Funds Transfer at Point of Sale). The
terminal usually includes a keyboard for manual entry of data. A barcode reader may also be
provided, typically to read stock codes.
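
The exchange between a more complex terminal and the computer might look roughly like the sketch below. The price and stock files are represented by plain dictionaries, and the stock codes, prices and function name are hypothetical.

```python
def process_sale(stock_code, quantity, prices, stock):
    """Look up the price, check availability, and update the stock record."""
    if stock_code not in prices:
        return {"ok": False, "reason": "unknown stock code"}
    if stock.get(stock_code, 0) < quantity:
        return {"ok": False, "reason": "insufficient stock"}
    stock[stock_code] -= quantity
    return {"ok": True, "total": prices[stock_code] * quantity}


prices = {"A100": 250, "B200": 1299}  # hypothetical price file, in pence
stock = {"A100": 3, "B200": 0}        # hypothetical stock file

first = process_sale("A100", 2, prices, stock)
second = process_sale("B200", 1, prices, stock)
print(first)
print(second)
```

Working in whole pence avoids rounding problems that can arise when money is held as floating-point values.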

Barcoding of this type is used on packets of consumable products such as foods. The numbers are
coded in barcoded strips and printed in OCR characters.

Such details would not be asked for in an examination but serve as a good illustration of a
specialized coding system.
Factors in choice
The choice of data collection method and medium may be influenced by the following factors:

Magnetic media such as magnetic tape and magnetic disk are primarily storage media, but are
often used at an intermediate stage of data input. For example, data may be captured onto a
diskette or magnetic cassette and then converted to magnetic tape on one computer prior to final
input to the main computer needing the data. These magnetic media are reusable and can be input
at much higher speeds than direct keying by DDE. Moreover, key-to-diskette systems
provide an advanced method of data collection, with facilities for checking and control as the data
are keyed, and reduce the need for verification on the main computer. Tape and diskette are
relatively cheap.

Character recognition.

MICR is largely confined to banking. It was developed in response to the need to cope with large
volumes of documents (in particular cheques) beyond the scope of conventional methods. It is a
very reliable but expensive method.

OCR is more versatile than MICR and less expensive. It is suited to those applications that use a
turnaround system, such as billing in gas and electricity, where volumes are too high for
conventional methods. It is limited to applications in which a turnaround document can be
used, eg, a bill printed by the computer, part of which is returned with the payment.

OMR is very simple and inexpensive. The forms can however be prepared only by people who
have been trained in the method. All character recognition techniques suffer from the possible
disadvantage of requiring a standardized document acceptable to the document reader.

Terminals provide a very fast and convenient means of data collection and provide the main
means of carrying out Direct Data Entry. They may also provide a fast means of output direct to
the point of use. But costs are increased by the need to provide terminals at a number of different
points and possibly by the additional use of data-transmission equipment.

Special media such as tags and barcoded strips reduce costs, but are essentially tailored to
particular types of application.

Cost. This must be an overriding factor. The elements of cost are:

a. Staff (probably the biggest).
b. Hardware (capital and running costs).
c. Media. Paper-based source documents are not reusable and magnetic media can only be
reused a limited number of times.
d. Changeover. There is normally a cost associated with changing over to a new method of
data input.

Time. This can be quite fundamental in the choice of method and medium and is very much
linked with cost because the quicker the response required the more it generally costs to get that
response. Online systems will cut down the delay, as will methods such as OCR that prepare
source documents in a machine-sensible form.

Accuracy. This is linked with appropriateness and confidence, and is a big headache in data
collection. Input must be clean, otherwise it is rejected and delays occur. Errors at the
preparation stage also are costly. Substitution of the machine for the human is the answer in
general terms.

Volume. Some methods will not be able to cope with high volumes of source data within a
reasonable time scale.

Confidence. It is very important that a system has a record of success. This is probably why
many promising new methods take so long to be adopted.

Input medium. The choice of input medium is very much tied up with data collection. Often it
will be an integral part of data collection, eg, online systems. Key-to-diskette methods have
the advantage of collecting data on what is a fast input medium. These two examples are enough
to demonstrate the way in which input medium is a prime consideration when looking at the
collection of data.
