
Prepared By: M. FARHANA SATHATH - MCA

UNIT-III: RELATIONAL DATABASE

Pitfalls in relational database design:
i. Redundant information in tuples
ii. Update anomalies

i. Redundant information in tuples
A basic objective of normalization is to reduce redundancy, so that each piece of information is stored only once. Storing the same information several times wastes storage space and increases the total size of the stored data.
Solution: Group attributes into relation schemas carefully, because the grouping has a significant effect on storage space.

ii. Update anomalies
An anomaly is an error or inconsistency in the database. A poorly designed database suffers from three types of anomalies:
1. Insertion anomalies: occur when new data values are inserted into a relation.
2. Deletion anomalies: occur when a tuple (a row of a relation) is deleted.
3. Modification anomalies: occur when the value of an attribute in a tuple is updated.

FUNCTIONAL DEPENDENCIES

i. Definition:

A functional dependency requires that the value of one set of attributes uniquely determines the value of another set of attributes. In a given relation R, let X and Y be attributes. Attribute Y is functionally dependent on attribute X if each value of X determines exactly one value of Y. This is written as:

X → Y

i.e. X determines Y, or Y is functionally dependent on X. Note that X → Y does not imply Y → X.

Example:

Marks → Grade
Rno → Name
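The definition above can be tested against a concrete set of rows. The following is a minimal sketch, not part of the original notes: the helper name fd_holds and the sample student rows are hypothetical, chosen only to illustrate that X → Y can hold while Y → X does not.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in this set of rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first value seen for a key must match every later value.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical sample data, for illustration only.
students = [
    {"Rno": 1, "Name": "Asha", "Marks": 92, "Grade": "A"},
    {"Rno": 2, "Name": "Ravi", "Marks": 75, "Grade": "B"},
    {"Rno": 3, "Name": "Meena", "Marks": 95, "Grade": "A"},
    {"Rno": 4, "Name": "Ravi", "Marks": 60, "Grade": "C"},
]

marks_to_grade = fd_holds(students, ["Marks"], ["Grade"])  # Marks -> Grade
grade_to_marks = fd_holds(students, ["Grade"], ["Marks"])  # Grade -> Marks
rno_to_name = fd_holds(students, ["Rno"], ["Name"])        # Rno -> Name
name_to_rno = fd_holds(students, ["Name"], ["Rno"])        # Name -> Rno
print(marks_to_grade, grade_to_marks, rno_to_name, name_to_rno)
```

Note that a check like this can only show that a dependency is violated or not violated by the given rows. Whether a dependency holds for the schema in general is a design decision about the data.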

ii. Diagrammatic Notation:

[Figure: dependency diagram over the attributes Eid, Pnumber, Hours, Ename, Pname and Plocations, with the dependencies labelled FD1, FD2 and FD3]

iii. Types:
a) Full Functional Dependency
b) Partial Functional Dependency
c) Transitive Functional Dependency

Full Functional Dependency:
In a relation R, let X and Y be attributes such that X functionally determines Y. Y is fully functionally dependent on X if no proper subset of X functionally determines Y.

{Student_no, Course_no} → Marks

Marks is fully functionally dependent on Student_no and Course_no together, and not on any proper subset of {Student_no, Course_no}; i.e. Marks cannot be determined by Student_no alone or by Course_no alone.
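The "no proper subset" condition can be checked by trying every proper subset of the determinant. This is a small sketch under stated assumptions: the function names and the enrolment rows are hypothetical and only mirror the Student_no / Course_no example above.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this set of rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True

def is_full_fd(rows, lhs, rhs):
    """True if lhs -> rhs holds and no non-empty proper subset of lhs determines rhs."""
    if not fd_holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if fd_holds(rows, list(subset), rhs):
                return False  # a part of lhs is already enough
    return True

# Hypothetical sample data, for illustration only.
enrolment = [
    {"Student_no": 1, "Course_no": "C1", "Course_name": "DBMS", "Marks": 80},
    {"Student_no": 1, "Course_no": "C2", "Course_name": "Networks", "Marks": 65},
    {"Student_no": 2, "Course_no": "C1", "Course_name": "DBMS", "Marks": 72},
    {"Student_no": 2, "Course_no": "C2", "Course_name": "Networks", "Marks": 90},
]

key = ["Student_no", "Course_no"]
marks_is_full = is_full_fd(enrolment, key, ["Marks"])
course_name_is_full = is_full_fd(enrolment, key, ["Course_name"])
print(marks_is_full, course_name_is_full)
```

In this sample, Marks needs both parts of the key, while Course_name is already determined by Course_no alone, so its dependency on the composite key is not full.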

Partial Functional Dependency:
In a relation R, let X and Y be attributes. Attribute Y is partially dependent on attribute X if it is dependent on a proper subset of X.

Course_no → Course_name, Instructor_name
(Course_no is a proper subset of {Student_no, Course_no})

For example, Course_name and Instructor_name are partially dependent on the composite attribute {Student_no, Course_no}, because Course_no alone determines Course_name and Instructor_name.

Transitive Functional Dependency:
In a relation R, let X, Y and Z be attributes. If X → Y and Y → Z, then X → Z, and Z is said to be transitively dependent on X.

Example:
Marks → Grade
Grade → Remark
Therefore Marks → Remark

Functional Dependency Theory:
Let R be a relation schema. The Greek letters α, β and γ are used to denote sets of attributes. Let α ⊆ R and β ⊆ R.
The functional dependency α → β holds on R if and only if, for any legal relation r(R), whenever any two tuples t1 and t2 of r agree on the attributes α, they also agree on the attributes β.

That is,

t1[α] = t2[α]  ⇒  t1[β] = t2[β]
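Dependencies that follow from others, such as Marks → Remark in the transitive example, can be derived mechanically with the standard attribute-closure procedure. The notes do not give this algorithm; the sketch below is an illustrative addition, and the function name closure is an assumption.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the given FDs.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything on the left is already known, add the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [({"Marks"}, {"Grade"}), ({"Grade"}, {"Remark"})]

marks_closure = closure({"Marks"}, fds)
grade_closure = closure({"Grade"}, fds)
print(sorted(marks_closure), sorted(grade_closure))
```

Because Remark appears in the closure of {Marks}, the dependency Marks → Remark is implied by the two given dependencies. Marks is not in the closure of {Grade}, so Grade → Marks is not implied.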

DECOMPOSITION
Decomposition is the process of splitting the universal relation schema R into a set of relation schemas D = {R1, R2, …, Rm} by using the functional dependencies. D becomes the relational database schema and is called a decomposition of R.

Attribute preservation condition:
Each attribute in R must appear in at least one relation schema Ri in the decomposition, so that no attributes are lost:

R1 ∪ R2 ∪ … ∪ Rm = R

Properties of decomposition:
i. Dependency preservation
ii. Lossless (or non-additive) join property

i. Dependency preservation:
Definition: Given a set of dependencies F on R, the projection of F on Ri, denoted by π_Ri(F), where Ri is a subset of R, is the set of dependencies X → Y in F+ such that the attributes in X ∪ Y are all contained in Ri. A decomposition is dependency preserving when the projected dependencies together retain the set of functional dependencies associated with the relation schema.
Formal definition: (F1 ∪ F2 ∪ … ∪ Fm)+ = F+, where Fi = π_Ri(F)

ii. Lossless (or non-additive) join property:
This property ensures that no spurious tuples are generated when a natural join operation is applied to the relations in the decomposition.
A decomposition D = {R1, R2, …, Rm} of R has the lossless join property with respect to the set of dependencies F on R if, for every relation state r of R that satisfies F, the following holds, where * is the natural join of all the relations in D:

*(π_R1(r), …, π_Rm(r)) = r

DATABASE NORMALIZATION
Database normalization is the process of removing redundant data from your tables in order to improve storage efficiency, data integrity, and scalability. In the relational model, methods exist for quantifying how efficient a database is. These classifications are called normal forms (or NF), and there are algorithms for converting a given database between them. Normalization generally involves splitting existing tables into multiple ones, which must be rejoined or linked each time a query is issued.

NORMAL FORM

The normal forms break down large tables into smaller subsets. Edgar F. Codd originally established three normal forms: 1NF, 2NF and 3NF. There are now others that are generally accepted, but 3NF is widely considered to be sufficient for most applications. Most tables when reaching 3NF are also in BCNF (Boyce-Codd Normal Form).

1. First Normal Form (1NF)
A relation is in 1NF if and only if all attribute values are atomic in nature, i.e. there are no repeating groups and no composite attributes.
o No repeating columns within a row and no composite attributes.
o No multi-valued columns.
o 1NF simplifies attributes: queries become easier.

The following table is not in 1NF:

DPT_NO   MG_NO   EMP_NO   EMP_NM
D101     12345   20000    Carl Sagan
                 20001    Mag James
                 20002    Larry Bird
D102     13456   30000    Jim Carter
                 30001    Paul Simon

Table in 1NF:

DPT_NO   MG_NO   EMP_NO   EMP_NM
D101     12345   20000    Carl Sagan
D101     12345   20001    Mag James
D101     12345   20002    Larry Bird
D102     13456   30000    Jim Carter
D102     13456   30001    Paul Simon

All attribute values are atomic because there are no repeating groups and no composite attributes.
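The conversion between the two tables above amounts to repeating the department values once per employee. The following sketch shows one way to do that. The nested representation, including the EMPLOYEES field name and the function name to_1nf, is an assumption made for illustration; only the data values come from the tables.

```python
# Non-1NF form: each department row carries a repeating group of employees.
# The "EMPLOYEES" field is a hypothetical way to model that repeating group.
nested = [
    {"DPT_NO": "D101", "MG_NO": 12345,
     "EMPLOYEES": [(20000, "Carl Sagan"), (20001, "Mag James"), (20002, "Larry Bird")]},
    {"DPT_NO": "D102", "MG_NO": 13456,
     "EMPLOYEES": [(30000, "Jim Carter"), (30001, "Paul Simon")]},
]

def to_1nf(departments):
    """Emit one flat row per employee so that every value is atomic."""
    flat = []
    for dept in departments:
        for emp_no, emp_nm in dept["EMPLOYEES"]:
            flat.append({
                "DPT_NO": dept["DPT_NO"],
                "MG_NO": dept["MG_NO"],
                "EMP_NO": emp_no,
                "EMP_NM": emp_nm,
            })
    return flat

flat_rows = to_1nf(nested)
for row in flat_rows:
    print(row)
```

The flat rows repeat DPT_NO and MG_NO for every employee. That redundancy is what the later normal forms address.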

2. Second Normal Form (2NF)
A relation is in 2NF if and only if it is in 1NF and all the non-key attributes of the relation are fully functionally dependent on the whole key and not just on a part of the key.
Second normal form (2NF) further addresses the concept of removing duplicative data. A relation R is in 2NF if:
o R is in 1NF.
o All non-prime attributes are fully dependent on the candidate keys.
Attributes that break this rule are moved into new tables, and relationships between these new tables and their predecessors are created through the use of foreign keys. A prime attribute is one that appears in a candidate key. There is no partial dependency in 2NF.

Inventory
Description   Supplier   Cost   Supplier Address

The primary key is (Description, Supplier). There are two non-key fields, Cost and Supplier Address. So, here are the questions:
If I know just Description, can I find out Cost? No, because we have more than one supplier for the same product.
If I know just Supplier, can I find out Cost? No, because I need to know what the item is as well.

Therefore, Cost is fully functionally dependent upon the ENTIRE PK (Description-Supplier) for its existence.

If I know just Description, can I find out Supplier Address? No, because we have more than one supplier for the same product.
If I know just Supplier, can I find out Supplier Address? Yes. The address does not depend upon the description of the item.

Therefore, Supplier Address is NOT functionally dependent upon the ENTIRE PK (Description-Supplier) for its existence. It is moved into a table of its own:

Supplier
Supplier Name   Supplier Address

So putting things together, the original table

Inventory
Description   Supplier   Cost   Supplier Address

becomes

Inventory
Description   Supplier   Cost

Supplier
Supplier Name   Supplier Address

The above relations are now in 2NF, since no non-key attribute depends on only a part of the key.

3. Third Normal Form (3NF)
A relation is in 3NF if and only if it is in 2NF and there is no transitive functional dependency between non-key attributes.
Remove columns that are not dependent upon the primary key. Formally, for every nontrivial functional dependency X → A, either (1) X is a superkey, or (2) A is a prime (key) attribute.

Books
Name   Author's Name   Author's Nom de Plume   # of Pages

If I know # of Pages, can I find out Author's Name? No. Can I find out Author's Nom de Plume? No.
If I know Author's Name, can I find out # of Pages? No. Can I find out Author's Nom de Plume? YES.

Therefore, Author's Nom de Plume is functionally dependent upon Author's Name, not upon the PK, for its existence. It has to go.
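The 3NF condition (for every nontrivial X → A, X is a superkey or A is a prime attribute) can be applied to the Books example as a rough check. This sketch makes several assumptions that are not stated in the notes: Name is taken to be the primary key of Books, the listed dependencies are the only ones, and only the listed dependencies are tested rather than everything in F+. The function names are illustrative.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def third_nf_violations(attributes, fds, candidate_keys):
    """List (X, A) pairs where X -> A has X not a superkey and A not prime."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= set(attributes)
        for a in sorted(set(rhs) - set(lhs)):  # skip the trivial part
            if not is_superkey and a not in prime:
                violations.append((frozenset(lhs), a))
    return violations

# Assumed dependencies for Books, with Name as the assumed primary key.
books_attrs = {"Name", "Author's Name", "Nom de Plume", "# of Pages"}
books_fds = [
    ({"Name"}, {"Author's Name", "# of Pages"}),
    ({"Author's Name"}, {"Nom de Plume"}),
]
before = third_nf_violations(books_attrs, books_fds, [{"Name"}])

# After moving Nom de Plume into its own Author table.
books_after = third_nf_violations(
    {"Name", "Author's Name", "# of Pages"},
    [({"Name"}, {"Author's Name", "# of Pages"})],
    [{"Name"}],
)
author_after = third_nf_violations(
    {"Author's Name", "Nom de Plume"},
    [({"Author's Name"}, {"Nom de Plume"})],
    [{"Author's Name"}],
)
print(before, books_after, author_after)
```

Under these assumptions the only violation reported for the original table is Author's Name → Nom de Plume, which matches the conclusion that the pen name has to move out of Books.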

Books
Name   Author's Name   # of Pages

Author
Author Name   Nom de Plume

Boyce-Codd Normal Form (BCNF)
A relation is in BCNF if and only if every determinant is a candidate key. The difference between 3NF and BCNF is that for a functional dependency A → B, 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key, whereas BCNF insists that for this dependency to remain in a relation, A must be a candidate key.
Assume that a relation has more than one possible key. Assume further that the composite keys have a common attribute. If an attribute of one composite key is dependent on an attribute of the other composite key, a normalization called BCNF is needed. Consider, as an example, the relation Professor:

Professor (Professor Code, Dept, Head of Dept, Percent time)

It is assumed that:
1. A professor can work in more than one department.
2. The percentage of the time he spends in each department is given.
3. Each department has only one Head of Department.

The relationship diagram for the above relation is given in figure 8. Table 6 gives the relation attributes. The two possible composite keys are (Professor Code, Dept) and (Professor Code, Head of Dept). Observe that Dept and Head of Dept are not non-key attributes. They are a part of a composite key.

The relation given in table 6 is in 3NF. Observe, however, that the names of Dept and Head of Dept are duplicated. Further, if Professor P2 resigns, rows 3 and 4 are deleted, and we lose the information that Rao is the Head of Department of Chemistry.
The normalization of the relation is done by creating a new relation for Dept and Head of Dept, and deleting Head of Dept from the Professor relation. The normalized relations are shown in table 7.
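The BCNF condition for the Professor relation can be checked with a short sketch. The dependency list here is an assumption derived from the three stated rules. In particular, Head of Dept → Dept is not stated directly; it is assumed because (Professor Code, Head of Dept) is named as a possible key. The check tests whether each determinant is a superkey, which is the usual way of stating "every determinant is a candidate key". The new relation name Department and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the nontrivial FDs whose left side is not a superkey."""
    return [
        (lhs, rhs) for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attributes)
    ]

professor = {"Professor Code", "Dept", "Head of Dept", "Percent time"}
professor_fds = [
    ({"Professor Code", "Dept"}, {"Percent time"}),
    ({"Dept"}, {"Head of Dept"}),
    ({"Head of Dept"}, {"Dept"}),  # assumption, see the note above
]
violations = bcnf_violations(professor, professor_fds)

# After the split: Professor keeps the time share, Department holds the head.
professor_after = bcnf_violations(
    {"Professor Code", "Dept", "Percent time"},
    [({"Professor Code", "Dept"}, {"Percent time"})],
)
department_after = bcnf_violations(
    {"Dept", "Head of Dept"},
    [({"Dept"}, {"Head of Dept"}), ({"Head of Dept"}, {"Dept"})],
)
print(len(violations), professor_after, department_after)
```

With these assumed dependencies, Dept and Head of Dept are determinants in the original relation without being keys of it, which is why the relation is in 3NF but not in BCNF. Neither of the two relations after the split reports a violation.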

MULTIVALUED DEPENDENCY:
A multivalued dependency (MVD) X →→ Y specified on relation schema R, where X and Y are both subsets of R, specifies the following constraint on any relation state r of R: if two tuples t1 and t2 exist in r such that t1[X] = t2[X], then two tuples t3 and t4 should also exist in r with the following properties, where we use Z to denote (R − (X ∪ Y)):

t3[X] = t4[X] = t1[X] = t2[X]
t3[Y] = t1[Y] and t4[Y] = t2[Y]
t3[Z] = t2[Z] and t4[Z] = t1[Z]

Whenever X →→ Y holds, we say that X multidetermines Y.

4. Fourth Normal Form (4NF)
Relation R is in 4NF if and only if, whenever there exist subsets A and B of the attributes of R such that the (nontrivial) MVD A →→ B is satisfied, then all attributes of R are also functionally dependent on A.
The process of reducing interdependencies between columns (not tables) continues with 4NF. Assuming every teacher uses the same books when teaching the same course, the following table is not in 4NF:

Course    Teacher   Textbook
Physics   Jones     Basic Mechanics
Physics   Jones     Principles of Optics
Physics   Smith     Basic Mechanics
Physics   Smith     Principles of Optics
Math      Jones     Basic Mechanics
Math      Jones     Vector Analysis
Math      Jones     Trigonometry

Looking at it intuitively, there's a lot of duplication of information. If the school decided to drop Basic Mechanics in favor of Introduction to Mechanics, you'd have a lot of work to update everything. The solution, again, is to break the table in two:

Table: Course-Teacher
Course    Teacher
Physics   Jones
Physics   Smith
Math      Jones

Table: Course-Text
Course    Text
Physics   Basic Mechanics
Physics   Principles of Optics
Math      Basic Mechanics
Math      Vector Analysis
Math      Trigonometry
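The split into Course-Teacher and Course-Text is safe only if joining the two tables gives back exactly the original rows, which is the lossless join property described under DECOMPOSITION. The sketch below checks this for the seven rows of the Course / Teacher / Textbook table. The variable names are illustrative, and the join is written out by hand with set comprehensions rather than with a database library.

```python
# The original Course / Teacher / Textbook rows from the 4NF example.
course_teacher_text = {
    ("Physics", "Jones", "Basic Mechanics"),
    ("Physics", "Jones", "Principles of Optics"),
    ("Physics", "Smith", "Basic Mechanics"),
    ("Physics", "Smith", "Principles of Optics"),
    ("Math", "Jones", "Basic Mechanics"),
    ("Math", "Jones", "Vector Analysis"),
    ("Math", "Jones", "Trigonometry"),
}

# Project onto (Course, Teacher) and (Course, Text); sets drop duplicates.
course_teacher = {(c, t) for c, t, _ in course_teacher_text}
course_text = {(c, b) for c, _, b in course_teacher_text}

# Natural join of the two projections on the shared Course column.
rejoined = {
    (c, t, b)
    for c, t in course_teacher
    for c2, b in course_text
    if c == c2
}

is_lossless = rejoined == course_teacher_text
print(len(course_teacher), len(course_text), is_lossless)
```

The projections contain three and five rows, the same rows as the two tables shown above, and the join reproduces the original seven rows with no spurious tuples.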

JOIN DEPENDENCY:
A relation R satisfies the join dependency *(R1, R2, …, Rn) if and only if R is equal to the join of its projections on R1, R2, …, Rn, where the Ri are subsets of the set of attributes of R.

5. Fifth Normal Form (5NF)
A relation R is in 5NF, also called projection-join normal form (PJ/NF), if and only if every join dependency in R is implied by the candidate keys of R.
The 5th Normal Form is a projection-based normalization approach where every nontrivial join dependency is implied by a candidate key. 5NF could split a single table into three or more.
The goal: Agents represent companies, companies make products, and agents sell products. We might want to keep track of which agent sells which product for which company.
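Whether a three-column agent / company / product table can be split into three two-column tables depends on whether the join dependency holds for its rows. The notes give no data for this example, so the rows below are entirely hypothetical, and the function name is an assumption. This is a sketch of the check, not the notes' own worked example.

```python
def satisfies_binary_jd(rows):
    """Check whether rows equal the join of their three two-column projections.

    Each row is an (agent, company, product) tuple.
    """
    agent_company = {(a, c) for a, c, _ in rows}
    company_product = {(c, p) for _, c, p in rows}
    agent_product = {(a, p) for a, _, p in rows}
    joined = {
        (a, c, p)
        for a, c in agent_company
        for c2, p in company_product
        if c == c2 and (a, p) in agent_product
    }
    return joined == set(rows)

# Hypothetical data where the split into three tables loses nothing.
sells_ok = {
    ("Smith", "Ford", "car"),
    ("Smith", "Ford", "truck"),
    ("Smith", "GM", "car"),
    ("Smith", "GM", "truck"),
    ("Jones", "Ford", "car"),
}

# Hypothetical data where the join would invent ("Smith", "Ford", "truck").
sells_bad = {
    ("Smith", "Ford", "car"),
    ("Smith", "GM", "truck"),
    ("Jones", "Ford", "truck"),
}

jd_ok = satisfies_binary_jd(sells_ok)
jd_bad = satisfies_binary_jd(sells_bad)
print(jd_ok, jd_bad)
```

In the second data set the three projections cannot be joined back without producing a row that was never recorded, so that table should be kept as a single three-column relation.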

Normalized relations:

Normal form: Brief definition

First normal form (1NF): Table faithfully represents a relation and has no repeating groups.
Second normal form (2NF): No non-prime attribute in the table is functionally dependent on a proper subset of any candidate key.
Third normal form (3NF): Every non-prime attribute is non-transitively dependent on every candidate key in the table.
Elementary Key Normal Form (EKNF): Every non-trivial functional dependency in the table is either the dependency of an elementary key attribute or a dependency on a superkey.
Boyce-Codd normal form (BCNF): Every non-trivial functional dependency in the table is a dependency on a superkey.
Fourth normal form (4NF): Every non-trivial multivalued dependency in the table is a dependency on a superkey.
Fifth normal form (5NF): Every non-trivial join dependency in the table is implied by the superkeys of the table.
Domain/key normal form (DKNF): Every constraint on the table is a logical consequence of the table's domain constraints and key constraints.
Sixth normal form (6NF): Table features no non-trivial join dependencies at all (with reference to generalized join operator).