
FILOR

FILE ORGANIZATION INTRODUCTION

FILE DESIGN

The study of file structures involves the investigation of the data structures used to organize a large collection of data into one or more external files that are stored on secondary storage devices.

FILE - a collection of related data

Example: payroll file

FILE ORGANIZATION
- refers to the way in which records are stored in an external file
- refers to the data structures used for organizing the data

FOUR COMMON FILE ORGANIZATIONS


1. Sequential
2. Random
3. Indexed sequential
4. Multikey

SEQUENTIAL FILE ORGANIZATION
- records are written consecutively
- records are stored in ascending or descending order according to a key field

ADVANTAGE:
- easier to maintain than other organizations, especially in terms of adding and deleting records

RANDOM FILE ORGANIZATION
- implies a predictable relationship between the key used to identify an individual record and that record's location in an external file

INDEXED SEQUENTIAL FILE ORGANIZATION
- combines sequential access and ordering with the capabilities of random access

TWO PARTS OF AN INDEXED SEQUENTIAL FILE:
1. A collection of records stored in contiguous locations within blocks in a relative file and ordered by a key field.
2. An index (a hierarchical structure of record keys and relative block numbers) to the file of ordered records.

MULTIKEY FILE ORGANIZATION
- allows access to a data file by several different key fields

Example: a library file that requires access by author, by subject matter, and by title.

Multikey organization can be implemented using B-trees.
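The idea of reaching the same records through several key fields can be sketched as follows. This is a minimal in-memory illustration, not a B-tree: the record layout, the sample values, and the names `build_index` and `lookup` are all assumptions made for the example, and each index is a plain Python dictionary.

```python
from collections import defaultdict

# Hypothetical library records; field names and values are illustrative only.
library = [
    {"title": "Book A", "author": "Author X", "subject": "files"},
    {"title": "Book B", "author": "Author Y", "subject": "databases"},
    {"title": "Book C", "author": "Author X", "subject": "databases"},
]


def build_index(records, field):
    """Map each value of `field` to the positions of the records holding it."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[field]].append(position)
    return index


def lookup(records, index, key):
    """Return every record whose indexed field equals `key`."""
    return [records[position] for position in index.get(key, [])]


# One index per key field; the data file itself is stored only once.
by_author = build_index(library, "author")
by_subject = build_index(library, "subject")

print([r["title"] for r in lookup(library, by_author, "Author X")])
print([r["title"] for r in lookup(library, by_subject, "databases")])
```

Each index stores record positions rather than copies of the records, so adding another access path costs one more index, not another copy of the data file.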

DATA FILE TYPES


1. Master file
2. Transaction file
3. Table file
4. Report file
5. Control file
6. History file

MASTER FILE
- contains records of permanent data that are updated by adding, deleting, or changing records

Example: a payroll master file contains an employee's social security number, rate of pay, marital status, number of exemptions claimed, and year-to-date deductions and earnings.

TRANSACTION FILE
- contains records of changes, additions, and deletions made to a master file
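The relationship between a master file and a transaction file can be sketched as one sequential update pass. This is a simplified illustration under stated assumptions: both files are sorted by key, there is at most one transaction per key, and the tuple layouts and the name `update_master` are hypothetical.

```python
def update_master(master, transactions):
    """Apply sorted transactions to a sorted master file.

    master:       list of (key, data), sorted by key
    transactions: list of (key, action, data), sorted by key,
                  with action in {"add", "change", "delete"}
    Assumes at most one transaction per key.
    Returns (new_master, errors).
    """
    new_master, errors = [], []
    i = 0
    for key, action, data in transactions:
        # Copy unchanged master records that sort before this transaction.
        while i < len(master) and master[i][0] < key:
            new_master.append(master[i])
            i += 1
        match = i < len(master) and master[i][0] == key
        if action == "add" and not match:
            new_master.append((key, data))
        elif action == "change" and match:
            new_master.append((key, data))
            i += 1
        elif action == "delete" and match:
            i += 1  # skip the record so it is not written to the new master
        else:
            errors.append((key, action, data))  # e.g. delete of a missing key
    new_master.extend(master[i:])  # copy whatever remains of the old master
    return new_master, errors


old_master = [(101, "Ana"), (103, "Ben"), (105, "Cy")]
changes = [(102, "add", "Bo"), (103, "change", "Benny"),
           (105, "delete", None), (107, "delete", None)]
new_master, errors = update_master(old_master, changes)
print(new_master)
print(errors)
```

The old master is never modified in place; the pass writes a new master, which is why the previous master and the transaction file can be kept as backups.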

TABLE FILE
- consists of a table of data, such as a price list, a tax rate table, or some other form of reference data, that is static and is referenced by one of the other types of files

REPORT FILE
- contains information that has been prepared for the user

CONTROL FILE
- is small and contains information concerning a particular maintenance run, such as the date of the run; the number of master records read, added, deleted, and written; and the number of transaction records read, processed, and in error

HISTORY FILE
- consists of all the backup master files, transaction files, and control files from past runs

FILE CHARACTERISTICS
1. Activity of a file
is a measure of the percentage of existing master records changed during a maintenance run.

2. Volatility of a file
is a measure of the number of records added and deleted compared to the original number of records.
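Both measures are simple ratios, as the following sketch shows. The function and parameter names are illustrative, and expressing volatility as a percentage (to match activity) is an assumption; the definition above only says "compared to the original number of records".

```python
def activity(records_changed, existing_records):
    """Percentage of existing master records changed in a maintenance run."""
    return 100.0 * records_changed / existing_records


def volatility(records_added, records_deleted, original_records):
    """Records added plus deleted, as a percentage of the original count.

    Reporting this as a percentage is an assumption made for the example.
    """
    return 100.0 * (records_added + records_deleted) / original_records


# Hypothetical run: 1000 master records, 250 changed, 30 added, 20 deleted.
print(activity(250, 1000))
print(volatility(30, 20, 1000))
```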

FILE MANIPULATION
1. Queries
involve searching a file for records containing certain values in particular key fields.

2. Merging
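The two operations can be sketched as follows, assuming that merging means combining two files already sorted on the same key into one sorted file. The record layout and the names `query` and `merge_sorted` are hypothetical.

```python
import heapq


def query(records, field, value):
    """Search a file for records whose `field` holds `value`."""
    return [record for record in records if record[field] == value]


def merge_sorted(file_a, file_b, key_field):
    """Merge two files, each already sorted on `key_field`, into one."""
    return list(heapq.merge(file_a, file_b, key=lambda r: r[key_field]))


# Hypothetical payroll records, each file sorted by "id".
file_a = [{"id": 1, "status": "single"},
          {"id": 4, "status": "married"},
          {"id": 7, "status": "single"}]
file_b = [{"id": 2, "status": "married"},
          {"id": 5, "status": "single"}]

merged = merge_sorted(file_a, file_b, "id")
print([record["id"] for record in merged])
print([record["id"] for record in query(merged, "status", "single")])
```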

TYPES OF FILE ORGANIZATION

- Serial
- Sequential
- Indexed Sequential
- Direct Access / Random Access

Serial File Organization

- A collection of records
- No particular sequence
- Cannot be used as a master file
- Used as a temporary transaction file
- Records are stored in the order received

Sequential File Organization

- A collection of records
- Stored in key sequence
- Adding or deleting a record requires making a new file
- Used as master files

Sequential file

Advantages

- Simple file design
- Very efficient when most of the records must be processed, e.g. payroll
- Very efficient if the data has a natural order
- Can be stored on inexpensive devices like magnetic tape

Disadvantages

- The entire file must be processed even if only a single record is needed
- Transactions have to be sorted before processing
- Overall processing is slow

Direct (Random) File Organization


- Records are read directly from or written directly to the file.
- The records are stored at known addresses.
- The address is calculated by applying a mathematical function to the key field.


- A random file has to be stored on a direct access backing storage medium, e.g. magnetic disc, CD, or DVD.

Example: any information retrieval system, e.g. a train timetable system.
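Calculating a record's address from its key can be sketched as follows. The division-remainder function is one common choice, not necessarily the one intended here; the file size, the forward probing used when two keys map to the same address, and all names are assumptions for illustration.

```python
NUM_SLOTS = 7  # hypothetical file size, in record slots


def address(key):
    """Division-remainder hashing: map a numeric key to a slot number."""
    return key % NUM_SLOTS


def store(slots, key, data):
    """Write a record at its calculated address, probing forward on collision."""
    for step in range(NUM_SLOTS):
        slot = (address(key) + step) % NUM_SLOTS
        if slots[slot] is None or slots[slot][0] == key:
            slots[slot] = (key, data)
            return slot
    raise RuntimeError("file is full")


def fetch(slots, key):
    """Read a record directly, without scanning the whole file."""
    for step in range(NUM_SLOTS):
        slot = (address(key) + step) % NUM_SLOTS
        if slots[slot] is None:
            return None  # an empty slot means the key was never stored
        if slots[slot][0] == key:
            return slots[slot][1]
    return None


slots = [None] * NUM_SLOTS
print(store(slots, 1001, "Train A"))  # 1001 % 7 == 0
print(store(slots, 1008, "Train B"))  # also maps to 0, so it moves to slot 1
print(store(slots, 1005, "Train C"))  # 1005 % 7 == 4
print(fetch(slots, 1008))
```

Slots that no key maps to stay empty, which illustrates why this organization does not use all of its storage locations.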

Advantages

- Any record can be accessed directly.
- Record processing is very fast.
- The file stays up to date because of online updating.
- Concurrent processing is possible.

Disadvantages

- More complex than sequential organization
- Does not make full use of storage locations
- More security and backup problems

Indexed sequential file


- Each record of a file has a key field which uniquely identifies that record.
- An index consists of keys and addresses.
- An indexed sequential file is a sequential file (i.e. sorted into order of a key field) which has an index.
- A full index to a file is one in which there is an entry for every record.
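The combination of an ordered file and an index can be sketched as follows. For brevity the sketch uses a partial index with one entry per block (the first key in each block) rather than a full index with an entry for every record; the block contents and the names `find` and `scan` are hypothetical.

```python
from bisect import bisect_right

# Records sorted by key and grouped into blocks (illustrative data).
blocks = [
    [(10, "A"), (20, "B"), (30, "C")],
    [(40, "D"), (50, "E"), (60, "F")],
    [(70, "G"), (80, "H")],
]

# The index holds the first key of each block; list position = block number.
index = [block[0][0] for block in blocks]


def find(key):
    """Random access: use the index to pick a block, then search only it."""
    block_no = bisect_right(index, key) - 1
    if block_no < 0:
        return None  # key is smaller than every key in the file
    for record_key, data in blocks[block_no]:
        if record_key == key:
            return data
    return None


def scan():
    """Sequential access: read every record in key order."""
    for block in blocks:
        yield from block


print(find(50))
print(find(45))
print([key for key, _ in scan()])
```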


Indexed sequential files are important for applications where data needs to be accessed both:
- sequentially
- randomly, using the index


- An indexed sequential file can only be stored on a random access device, e.g. magnetic disc or CD.

Advantages

- Provides flexibility for users who need both types of access to the same file
- Faster than sequential

Disadvantages

- Extra storage space is required for the index

Data Transfer Speed

Problem 1: How long does it take to send a 500 MB data file (in total) over a 1.5 Mbps connection? (Assume ideal circumstances.)

time = file-size / speed

Convert to the same units, e.g. seconds, Mbits, and Mbps.

file-size = 500 MB = 500 x 8 (bits per byte) = 4000 Mb
speed = 1.5 Mbps
time-in-seconds = file-size / speed = 4000 / 1.5 = 2666.67 seconds = 44 minutes and 26.67 seconds
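The same calculation can be checked with a few lines of code. The function name `transfer_time_seconds` is illustrative, and the sketch keeps the ideal-circumstances assumption (no protocol overhead, full use of the link).

```python
def transfer_time_seconds(file_size_megabytes, speed_mbps):
    """time = file-size / speed, with the size converted to megabits first."""
    file_size_megabits = file_size_megabytes * 8  # 8 bits per byte
    return file_size_megabits / speed_mbps


seconds = transfer_time_seconds(500, 1.5)
minutes, remainder = divmod(seconds, 60)
print(f"{seconds:.2f} seconds = {int(minutes)} minutes and {remainder:.2f} seconds")
```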

Problem 2: How fast a connection is required to transfer a 1.2 GB video (in total) in 10 minutes? (Assume ideal circumstances.)
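One way to work Problem 2 is to rearrange the formula to speed = file-size / time. The sketch below assumes decimal units (1 GB = 1000 MB), which the problem does not state; under that assumption 1.2 GB = 9600 Mb, 10 minutes = 600 seconds, and the required speed is 9600 / 600 = 16 Mbps. With 1 GB = 1024 MB the result would be 16.384 Mbps instead. The function name is illustrative.

```python
def required_speed_mbps(file_size_gigabytes, minutes, mb_per_gb=1000):
    """speed = file-size / time, in megabits per second.

    mb_per_gb=1000 is an assumption (decimal units); pass 1024 for binary units.
    """
    file_size_megabits = file_size_gigabytes * mb_per_gb * 8  # 8 bits per byte
    time_seconds = minutes * 60
    return file_size_megabits / time_seconds


speed = required_speed_mbps(1.2, 10)
print(f"{speed:.2f} Mbps")
```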
