Database Normalization

Database normalisation
1NF to 6NF, Boyce–Codd.
PDF generated at: Mon, 21 Mar 2011 05:14:41 UTC
Contents
Articles
Database normalization
First normal form
Second normal form
Third normal form
Fourth normal form
Fifth normal form
Sixth normal form
Boyce–Codd normal form
References
Article Sources and Contributors
Image Sources, Licenses and Contributors
Article Licenses
License
Database normalization
In the design of a relational database management system (RDBMS), the process of organizing data to minimize
redundancy is called normalization. The goal of database normalization is to decompose relations with anomalies in
order to produce smaller, well-structured relations. Normalization usually involves dividing large, badly-formed
tables into smaller, well-formed tables and defining relationships between them. The objective is to isolate data so
that additions, deletions, and modifications of a field can be made in just one table and then propagated through the
rest of the database via the defined relationships.
Edgar F. Codd, the inventor of the relational model, introduced the concept of normalization and what we now know
as the First Normal Form (1NF) in 1970.[1] Codd went on to define the Second Normal Form (2NF) and Third
Normal Form (3NF) in 1971,[2] and Codd and Raymond F. Boyce defined the Boyce-Codd Normal Form (BCNF) in
1974.[3] Higher normal forms were defined by other theorists in subsequent years, the most recent being the Sixth
Normal Form (6NF) introduced by Chris Date, Hugh Darwen, and Nikos Lorentzos in 2002.[4]
Informally, a relational database table (the computerized representation of a relation) is often described as
"normalized" if it is in the Third Normal Form.[5] Most 3NF tables are free of insertion, update, and deletion
anomalies, i.e. in most cases 3NF tables adhere to BCNF, 4NF, and 5NF (but typically not 6NF).
A standard piece of database design guidance is that the designer should create a fully normalized design; selective
denormalization can subsequently be performed for performance reasons.[6] However, some modeling disciplines,
such as the dimensional modeling approach to data warehouse design, explicitly recommend non-normalized
designs, i.e. designs that in large part do not adhere to 3NF.[7]
Objectives of normalization
A basic objective of the first normal form defined by Codd in 1970 was to permit data to be queried and manipulated
using a "universal data sub-language" grounded in first-order logic.[8] (SQL is an example of such a data
sub-language, albeit one that Codd regarded as seriously flawed.)[9]
The objectives of normalization beyond 1NF (First Normal Form) were stated as follows by Codd:
1. To free the collection of relations from undesirable insertion, update and deletion dependencies;
2. To reduce the need for restructuring the collection of relations as new types of data are introduced,
and thus increase the life span of application programs;
3. To make the relational model more informative to users;
4. To make the collection of relations neutral to the query statistics, where these statistics are liable to
change as time goes by.
—E.F. Codd, "Further Normalization of the Data Base Relational Model"[10]
The sections below give details of each of these objectives.
Free the database of modification anomalies

When an attempt is made to modify (update, insert into, or delete from) a table, undesired side-effects may follow.
Not all tables can suffer from these side-effects; rather, the side-effects can only arise in tables that have not
been sufficiently normalized. An insufficiently normalized table might have one or more of the following
characteristics:
• The same information can be expressed on multiple rows; therefore updates to the table may result in logical
inconsistencies. For example, each record in an "Employees' Skills" table might contain an Employee ID, Employee
Address, and Skill; thus a change of address for a particular employee will potentially need to be applied to
multiple records (one for each of his skills). If the update is not carried through successfully (that is, if the
employee's address is updated on some records but not others), then the table is left in an inconsistent state.
Specifically, the table provides conflicting answers to the question of what this particular employee's address
is. This phenomenon is known as an update anomaly.
[Figure: An update anomaly. Employee 519 is shown as having different addresses on different records.]
• There are circumstances in which certain facts cannot be recorded at all. For example, each record in a "Faculty
and Their Courses" table might contain a Faculty ID, Faculty Name, Faculty Hire Date, and Course Code; thus we can
record the details of any faculty member who teaches at least one course, but we cannot record the details of a
newly-hired faculty member who has not yet been assigned to teach any courses except by setting the Course
Code to null. This phenomenon is known as an insertion anomaly.
[Figure: An insertion anomaly. Until the new faculty member, Dr. Newsome, is assigned to teach at least one course,
his details cannot be recorded.]
• There are circumstances in which the deletion of data representing certain facts necessitates the deletion of data
representing completely different facts. The "Faculty and Their Courses" table described in the previous example
suffers from this type of anomaly, for if a faculty member temporarily ceases to be assigned to any courses, we
must delete the last of the records on which that faculty member appears, effectively also deleting the faculty
member. This phenomenon is known as a deletion anomaly.
[Figure: A deletion anomaly. All information about Dr. Giddens is lost when he temporarily ceases to be assigned to
any courses.]
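The update anomaly described above can be reproduced in a few lines. The following is a minimal sketch using
Python's built-in sqlite3 module; the table names, column names, addresses, and skills are hypothetical and chosen
only for illustration (only employee number 519 comes from the text).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Insufficiently normalized: the address is repeated on every skill row.
conn.execute(
    "CREATE TABLE employees_skills (employee_id INTEGER, employee_address TEXT, skill TEXT)"
)
conn.executemany(
    "INSERT INTO employees_skills VALUES (?, ?, ?)",
    [
        (519, "94 Chestnut Street", "Public Speaking"),  # hypothetical data
        (519, "94 Chestnut Street", "Carpentry"),
    ],
)

# An update that reaches only one of the employee's rows.
conn.execute(
    "UPDATE employees_skills SET employee_address = ? "
    "WHERE employee_id = ? AND skill = ?",
    ("96 Walnut Avenue", 519, "Carpentry"),
)
addresses = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT employee_address FROM employees_skills WHERE employee_id = 519"
    )
)
print(addresses)  # two conflicting addresses for employee 519

# Normalized alternative: the address is stored once, so it cannot disagree with itself.
conn.execute("CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, employee_address TEXT)")
conn.execute(
    "CREATE TABLE skills (employee_id INTEGER REFERENCES employees, skill TEXT, "
    "PRIMARY KEY (employee_id, skill))"
)
conn.execute("INSERT INTO employees VALUES (519, '94 Chestnut Street')")
conn.executemany("INSERT INTO skills VALUES (519, ?)", [("Public Speaking",), ("Carpentry",)])
conn.execute("UPDATE employees SET employee_address = '96 Walnut Avenue' WHERE employee_id = 519")
normalized_addresses = [
    row[0] for row in conn.execute("SELECT employee_address FROM employees WHERE employee_id = 519")
]
print(normalized_addresses)
```

In the two-table version an employee can also be recorded before any skill is known, and deleting the last skill row
does not delete the employee, which is how the same decomposition addresses the insertion and deletion anomalies.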
Minimize redesign when extending the database structure

When a fully normalized database structure is extended to allow it to accommodate new types of data, the
pre-existing aspects of the database structure can remain largely or entirely unchanged. As a result, applications
interacting with the database are minimally affected.
Make the data model more informative to users

Normalized tables, and the relationship between one normalized table and another, mirror real-world concepts and
their interrelationships.
Avoid bias towards any particular pattern of querying

Normalized tables are suitable for general-purpose querying. This means any queries against these tables, including
future queries whose details cannot be anticipated, are supported. In contrast, tables that are not normalized lend
themselves to some types of queries, but not others.
For example, consider an online bookseller whose customers maintain wishlists of books they'd like to have. For the
obvious, anticipated query (what books does this customer want?) it's enough to store the customer's wishlist in
the table as, say, a homogeneous string of authors and titles.
With this design, though, the database can answer only that one single query. It cannot by itself answer interesting
but unanticipated queries: What is the most-wished-for book? Which customers are interested in WWII espionage?
How does Lord Byron stack up against his contemporary poets? Answers to these questions must come from special
adaptive tools completely separate from the database. One tool might be software written especially to handle such
queries. This special adaptive software has just one single purpose: in effect to normalize the non-normalized field.
Unforeseen queries can be answered trivially, and entirely within the database framework, with a normalized table.
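As an illustration of the wishlist example, the sketch below stores one row per wished-for book and then answers two
of the unanticipated questions directly in SQL. It uses Python's built-in sqlite3 module; the table layout, customer
names, and book titles are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per (customer, book) wish instead of one string per customer.
conn.execute(
    "CREATE TABLE wishlist (customer TEXT, author TEXT, title TEXT, "
    "PRIMARY KEY (customer, author, title))"
)
conn.executemany(
    "INSERT INTO wishlist VALUES (?, ?, ?)",
    [
        ("alice", "Lord Byron", "Don Juan"),  # hypothetical data
        ("alice", "John Keats", "Endymion"),
        ("bob", "Lord Byron", "Don Juan"),
        ("carol", "Lord Byron", "Don Juan"),
        ("carol", "Percy Bysshe Shelley", "Prometheus Unbound"),
    ],
)

# What is the most-wished-for book?
most_wished = conn.execute(
    "SELECT author, title, COUNT(*) AS wishes FROM wishlist "
    "GROUP BY author, title ORDER BY wishes DESC, title LIMIT 1"
).fetchone()

# Which customers are interested in a given author?
byron_fans = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT customer FROM wishlist WHERE author = ? ORDER BY customer",
        ("Lord Byron",),
    )
]
print(most_wished, byron_fans)
```

Neither query was planned for when the table was defined; both follow from the fact that every author and title is
exposed as an ordinary column value.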
Example
Querying and manipulating the data within an unnormalized data structure, such as the following non-1NF
representation of customers' credit card transactions, involves more complexity than is really necessary:
Customer   Transactions
Jones      Tr. ID   Date          Amount
           12890    14-Oct-2003   −87
           12904    15-Oct-2003   −50
Wilkins    Tr. ID   Date          Amount
           12898    14-Oct-2003   −21
Stevens    Tr. ID   Date          Amount
           12907    15-Oct-2003   −18
           14920    20-Nov-2003   −70
           15003    27-Nov-2003   −60
To each customer there corresponds a repeating group of transactions. The automated evaluation of any query
relating to customers' transactions therefore would broadly involve two stages:
1. Unpacking one or more customers' groups of transactions allowing the individual transactions in a group to be
examined, and
2. Deriving a query result based on the results of the first stage
For example, in order to find out the monetary sum of all transactions that occurred in October 2003 for all
customers, the system would have to know that it must first unpack the Transactions group of each customer, then
sum the Amounts of all transactions thus obtained where the Date of the transaction falls in October 2003.
One of Codd's important insights was that this structural complexity could always be removed completely, leading to
much greater power and flexibility in the way queries could be formulated (by users and applications) and evaluated
(by the DBMS). The normalized equivalent of the structure above would look like this:
Customer Tr. ID Date Amount
Jones 12890 14-Oct-2003 −87
Jones 12904 15-Oct-2003 −50
Wilkins 12898 14-Oct-2003 −21
Stevens 12907 15-Oct-2003 −18
Stevens 14920 20-Nov-2003 −70
Stevens 15003 27-Nov-2003 −60
Now each row represents an individual credit card transaction, and the DBMS can obtain the answer of interest
simply by finding all rows with a Date falling in October and summing their Amounts. The data structure places all
of the values on an equal footing, exposing each to the DBMS directly, so each can potentially participate directly in
queries; whereas in the previous situation some values were embedded in lower-level structures that had to be
handled specially. Accordingly, the normalized design lends itself to general-purpose query processing, whereas the
unnormalized design does not.
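The contrast between the two structures can be sketched as follows. The nested dictionary stands in for the non-1NF
representation and needs the two-stage "unpack, then derive" evaluation; the flat table is queried in a single
statement. This is only an illustration using Python's built-in sqlite3 module: the variable and column names are
assumptions, and the dates are rewritten in ISO format so that they compare correctly as strings.

```python
import sqlite3

# Non-1NF shape: each customer carries a repeating group of transactions.
nested = {
    "Jones": [(12890, "2003-10-14", -87), (12904, "2003-10-15", -50)],
    "Wilkins": [(12898, "2003-10-14", -21)],
    "Stevens": [
        (12907, "2003-10-15", -18),
        (14920, "2003-11-20", -70),
        (15003, "2003-11-27", -60),
    ],
}

# Stage 1: unpack every customer's group into individual transactions.
unpacked = [(customer, *tx) for customer, txs in nested.items() for tx in txs]
# Stage 2: derive the result from the unpacked transactions.
nested_total = sum(amount for _, _, date, amount in unpacked if date.startswith("2003-10"))

# Normalized shape: one row per transaction, queried directly.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (customer TEXT, tr_id INTEGER PRIMARY KEY, date TEXT, amount INTEGER)"
)
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", unpacked)
flat_total = conn.execute(
    "SELECT SUM(amount) FROM transactions WHERE date >= '2003-10-01' AND date < '2003-11-01'"
).fetchone()[0]

print(nested_total, flat_total)  # both are -176
```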
Background to normalization: definitions

Functional dependency
In a given table, an attribute Y is said to have a functional dependency on a set of attributes X (written X → Y)
if and only if each X value is associated with precisely one Y value. For example, in an "Employee" table that
includes the attributes "Employee ID" and "Employee Date of Birth", the functional dependency {Employee
ID} → {Employee Date of Birth} would hold: each {Employee ID} is associated with precisely one {Employee
Date of Birth}. In practice the dependency can fail to hold if {Employee Date of Birth} is nullable, since an
{Employee ID} might then be associated with no {Employee Date of Birth}.
Trivial functional dependency
A trivial functional dependency is a functional dependency of an attribute on a superset of itself. {Employee
ID, Employee Address} → {Employee Address} is trivial, as is {Employee Address} → {Employee
Address}.
Full functional dependency
An attribute is fully functionally dependent on a set of attributes X if it is
• functionally dependent on X, and
• not functionally dependent on any proper subset of X.
{Employee Address} has a functional dependency on {Employee ID, Skill}, but not a full functional dependency,
because it is also dependent on {Employee ID}.
Transitive dependency
A transitive dependency is an indirect functional dependency, one in which X→Z only by virtue of X→Y and
Y→Z.
Multivalued dependency
A multivalued dependency is a constraint according to which the presence of certain rows in a table implies
the presence of certain other rows.
Join dependency
A table T is subject to a join dependency if T can always be recreated by joining multiple tables each having a
subset of the attributes of T.
Superkey
A superkey is a combination of attributes that can be used to uniquely identify a database record. A table
might have many superkeys.
Candidate key
A candidate key is a minimal superkey, that is, a superkey from which no attribute can be removed without
losing the ability to uniquely identify records.
Examples: Imagine a table with the fields <Name>, <Age>, <SSN> and <Phone Extension>. This table has many
possible superkeys. Three of these are <SSN>, <Phone Extension, Name> and <SSN, Name>. Of those listed, only
<SSN> is a candidate key, as the others contain information not necessary to uniquely identify records ('SSN' here
refers to Social Security Number, which is unique to each person).
Non-prime attribute
A non-prime attribute is an attribute that does not occur in any candidate key. Employee Address would be a
non-prime attribute in the "Employees' Skills" table.
Primary key
Most DBMSs require a table to be defined as having a single unique key, rather than a number of possible
unique keys. A primary key is a key which the database designer has designated for this purpose.
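Several of the definitions above can be tried out on sample data. The sketch below checks functional dependencies,
superkeys, and candidate keys against the rows of a small "Employees' Skills" table. The helper names and the sample
rows are hypothetical. Note the limitation: dependencies and keys are constraints on every value the table may ever
hold, so a check against sample rows can only show that the sample does not contradict them.

```python
from itertools import combinations


def holds_fd(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


def is_superkey(rows, attrs):
    """Return True if the attrs values are distinct across all rows."""
    values = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(values)) == len(values)


def candidate_keys(rows, attributes):
    """Return the minimal superkeys, trying smaller attribute sets first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in keys)
            if not contains_smaller_key and is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


ATTRS = ("Employee ID", "Employee Address", "Skill")
rows = [  # hypothetical sample data
    dict(zip(ATTRS, values))
    for values in [
        (426, "87 Sycamore Grove", "Typing"),
        (426, "87 Sycamore Grove", "Shorthand"),
        (519, "94 Chestnut Street", "Typing"),
        (519, "94 Chestnut Street", "Carpentry"),
        (610, "87 Sycamore Grove", "Typing"),
    ]
]

id_determines_address = holds_fd(rows, ["Employee ID"], ["Employee Address"])
skill_determines_address = holds_fd(rows, ["Skill"], ["Employee Address"])
keys = candidate_keys(rows, ATTRS)
non_prime = [a for a in ATTRS if not any(a in k for k in keys)]
print(id_determines_address, skill_determines_address, keys, non_prime)
```

On this sample the only candidate key is {Employee ID, Skill}, Employee Address is the only non-prime attribute, and
Employee Address depends on {Employee ID} alone, so its dependency on the whole key is not a full one.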
Normal forms
The normal forms (abbrev. NF) of relational database theory provide criteria for determining a table's degree of
vulnerability to logical inconsistencies and anomalies. The higher the normal form applicable to a table, the less
vulnerable it is to inconsistencies and anomalies. Each table has a "highest normal form" (HNF): by definition, a
table always meets the requirements of its HNF and of all normal forms lower than its HNF; also by definition, a
table fails to meet the requirements of any normal form higher than its HNF.
The normal forms are applicable to individual tables; to say that an entire database is in normal form n is to say that
all of its tables are in normal form n.
Newcomers to database design sometimes suppose that normalization proceeds in an iterative fashion, i.e. a 1NF
design is first normalized to 2NF, then to 3NF, and so on. This is not an accurate description of how normalization
typically works. A sensibly designed table is likely to be in 3NF on the first attempt; furthermore, if it is 3NF, it is
overwhelmingly likely to have an HNF of 5NF. Achieving the "higher" normal forms (above 3NF) does not usually
require an extra expenditure of effort on the part of the designer, because 3NF tables usually need no modification to
meet the requirements of these higher normal forms.
The main normal forms are summarized below.
Normal form: First normal form (1NF)
Defined by: Two versions: E.F. Codd (1970), C.J. Date (2003)[11]
Brief definition: Table faithfully represents a relation and has no repeating groups

Normal form: Second normal form (2NF)
Defined by: E.F. Codd (1971)[2]
Brief definition: No non-prime attribute in the table is functionally dependent on a proper subset of a candidate key

Normal form: Third normal form (3NF)
Defined by: E.F. Codd (1971)[2]; see also Carlo Zaniolo's equivalent but differently-expressed definition (1982)[12]
Brief definition: Every non-prime attribute is non-transitively dependent on every candidate key in the table

Normal form: Boyce–Codd normal form (BCNF)
Defined by: Raymond F. Boyce and E.F. Codd (1974)[13]
Brief definition: Every non-trivial functional dependency in the table is a dependency on a superkey

Normal form: Fourth normal form (4NF)
Defined by: Ronald Fagin (1977)[14]
Brief definition: Every non-trivial multivalued dependency in the table is a dependency on a superkey

Normal form: Fifth normal form (5NF)
Defined by: Ronald Fagin (1979)[15]
Brief definition: Every non-trivial join dependency in the table is implied by the superkeys of the table

Normal form: Domain/key normal form (DKNF)
Defined by: Ronald Fagin (1981)[16]
Brief definition: Every constraint on the table is a logical consequence of the table's domain constraints and key
constraints

Normal form: Sixth normal form (6NF)
Defined by: C.J. Date, Hugh Darwen, and Nikos Lorentzos (2002)[4]
Brief definition: Table features no non-trivial join dependencies at all (with reference to generalized join operator)
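The BCNF criterion in the summary above lends itself to a mechanical check when the functional dependencies of a
table are written down explicitly. The following sketch uses the standard attribute-closure technique: a set of
attributes is a superkey exactly when its closure under the declared dependencies covers every attribute. The
function names and the way dependencies are encoded are assumptions made for this illustration, and the check only
covers BCNF, not the other normal forms.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs of frozensets, read as lhs -> rhs.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the declared dependencies that are non-trivial and whose left side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        is_trivial = rhs <= lhs
        is_superkey = closure(lhs, fds) >= attributes
        if not is_trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations


# "Employees' Skills": the address depends on the employee alone.
address_fd = (frozenset({"Employee ID"}), frozenset({"Employee Address"}))
employees_skills = frozenset({"Employee ID", "Employee Address", "Skill"})
before = bcnf_violations(employees_skills, [address_fd])

# After decomposition into two tables, no declared dependency violates BCNF.
employees = frozenset({"Employee ID", "Employee Address"})
skills = frozenset({"Employee ID", "Skill"})
after = bcnf_violations(employees, [address_fd]) + bcnf_violations(skills, [])

print(before, after)
```

For the single combined table the dependency {Employee ID} → {Employee Address} is reported, because {Employee ID}
alone does not determine Skill; after the decomposition nothing is reported.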
Denormalization
Databases intended for online transaction processing (OLTP) are typically more normalized than databases intended
for online analytical processing (OLAP). OLTP applications are characterized by a high volume of small
transactions such as updating a sales record at a supermarket checkout counter. The expectation is that each
transaction will leave the database in a consistent state. By contrast, databases intended for OLAP operations are
primarily "read mostly" databases. OLAP applications tend to extract historical data that has accumulated over a
long period of time. For such databases, redundant or "denormalized" data may facilitate business intelligence
applications. Specifically, dimensional tables in a star schema often contain denormalized data. The denormalized or
redundant data must be carefully controlled during extract, transform, load (ETL) processing, and users should not
be permitted to see the data until it is in a consistent state. The normalized alternative to the star schema is the
snowflake schema. In many cases, the need for denormalization has waned as computers and RDBMS software have
become more powerful, but since data volumes have generally increased along with hardware and software
performance, OLAP databases often still use denormalized schemas.
Denormalization is also used to improve performance on smaller computers as in computerized cash-registers and
mobile devices, since these may use the data for look-up only (e.g. price lookups). Denormalization may also be
used when no RDBMS exists for a platform (such as Palm), or no changes are to be made to the data and a swift
response is crucial.
Non-first normal form (NF² or N1NF)

In recognition that denormalization can be deliberate and useful, the non-first normal form is a definition of database
designs which do not conform to first normal form, by allowing "sets and sets of sets to be attribute domains" (Schek
1982). The languages used to query and manipulate data in the model must be extended accordingly to support such
values.
One way of looking at this is to consider such structured values as being specialized types of values (domains), with
their own domain-specific languages. However, what is usually meant by non-1NF models is the approach in which
the relational model and the languages used to query it are extended with a general mechanism for such structure; for
instance, the nested relational model supports the use of relations as domain values, by adding two additional
operators (nest and unnest) to the relational algebra that can create and flatten nested relations, respectively.
Consider the following table:
First Normal Form

Person Favorite Color
Bob blue
Bob red
Jane green
Jane yellow
Jane red
Assume a person has several favorite colors. The set of favorite colors for each person is modeled in the table
above by one row per color. To transform a 1NF table into an NF² table, a "nest" operator is required, which extends
the relational algebra of the higher normal forms. Applying the "nest" operator to the 1NF table yields the following
NF² table:
Non-First Normal Form

Person   Favorite Colors
Bob      Favorite Color
         blue
         red
Jane     Favorite Color
         green
         yellow
         red
To transform this NF² table back into a 1NF an "unnest" operator is required which extends the relational algebra of
the higher normal forms. The unnest, in this case, would make "colors" into its own table.
Although "unnest" is the mathematical inverse to "nest", the operator "nest" is not always the mathematical inverse
of "unnest". Another constraint required is for the operators to be bijective, which is covered by the Partitioned
Normal Form (PNF).
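The behaviour of the two operators on the favorite-colors example can be sketched in a few lines of Python. Relations
are modeled here as sets of tuples and nested values as frozensets; the function names nest and unnest follow the
text, but their signatures are assumptions made for this illustration and are fixed to two-column relations.

```python
def nest(rows):
    """Group (person, color) pairs into (person, set of colors) pairs."""
    grouped = {}
    for person, color in rows:
        grouped.setdefault(person, set()).add(color)
    return {(person, frozenset(colors)) for person, colors in grouped.items()}


def unnest(nested_rows):
    """Flatten (person, set of colors) pairs back into (person, color) pairs."""
    return {(person, color) for person, colors in nested_rows for color in colors}


flat = {
    ("Bob", "blue"),
    ("Bob", "red"),
    ("Jane", "green"),
    ("Jane", "yellow"),
    ("Jane", "red"),
}

# Unnesting a nested relation recovers the original 1NF relation.
round_trip = unnest(nest(flat))

# The reverse does not always hold: here Bob appears on two nested rows,
# and nesting after unnesting merges them into one.
split = {("Bob", frozenset({"blue"})), ("Bob", frozenset({"red"}))}
renested = nest(unnest(split))

print(round_trip == flat, renested == split)  # True False
```

The second relation is the kind of case the additional constraint rules out: when each person appears on at most one
nested row, nesting after unnesting returns the relation unchanged.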
Notes and references

[1] Codd, E.F. (June 1970). "A Relational Model of Data for Large Shared Data Banks" (http://www.acm.org/classics/nov95/toc.html).
Communications of the ACM 13 (6): 377–387. doi:10.1145/362384.362685.
[2] Codd, E.F. "Further Normalization of the Data Base Relational Model". (Presented at Courant Computer Science Symposia Series 6, "Data
Base Systems", New York City, May 24–25, 1971.) IBM Research Report RJ909 (August 31, 1971). Republished in Randall J. Rustin (ed.),
Data Base Systems: Courant Computer Science Symposia Series 6. Prentice-Hall, 1972.
[3] Codd, E. F. "Recent Investigations into Relational Data Base Systems". IBM Research Report RJ1385 (April 23, 1974). Republished in Proc.
1974 Congress (Stockholm, Sweden, 1974). New York, N.Y.: North-Holland (1974).
[4] C.J. Date, Hugh Darwen, Nikos Lorentzos. Temporal Data and the Relational Model. Morgan Kaufmann (2002), p. 176
[5] C.J. Date. An Introduction to Database Systems. Addison-Wesley (1999), p. 290
[6] Chris Date, for example, writes: "I believe firmly that anything less than a fully normalized design is strongly contraindicated ... [Y]ou should
"denormalize" only as a last resort. That is, you should back off from a fully normalized design only if all other strategies for improving
performance have somehow failed to meet requirements." Date, C.J. Database in Depth: Relational Theory for Practitioners. O'Reilly (2005),
p. 152.
[7] Ralph Kimball, for example, writes: "The use of normalized modeling in the data warehouse presentation area defeats the whole purpose of
data warehousing, namely, intuitive and high-performance retrieval of data." Kimball, Ralph. The Data Warehouse Toolkit, 2nd Ed.. Wiley
Computer Publishing (2002), p. 11.
[8] "The adoption of a relational model of data ... permits the development of a universal data sub-language based on an applied predicate
calculus. A first-order predicate calculus suffices if the collection of relations is in first normal form. Such a language would provide a
yardstick of linguistic power for all other proposed data languages, and would itself be a strong candidate for embedding (with appropriate
syntactic modification) in a variety of host languages (programming, command- or problem-oriented)." Codd, "A Relational Model of Data
for Large Shared Data Banks" (http://www.acm.org/classics/nov95/toc.html), p. 381
[9] Codd, E.F. Chapter 23, "Serious Flaws in SQL", in The Relational Model for Database Management: Version 2. Addison-Wesley (1990), pp.
371–389
[10] Codd, E.F. "Further Normalization of the Data Base Relational Model", p. 34
[11] Date, C. J. "What First Normal Form Really Means" in Date on Database: Writings 2000–2006 (Springer-Verlag, 2006), pp. 127–128.
[12] Zaniolo, Carlo. "A New Normal Form for the Design of Relational Database Schemata." ACM Transactions on Database Systems 7(3),
September 1982.
[13] Codd, E. F. "Recent Investigations into Relational Data Base Systems". IBM Research Report RJ1385 (April 23, 1974). Republished in
Proc. 1974 Congress (Stockholm, Sweden, 1974). New York, N.Y.: North-Holland (1974).
[14] Fagin, Ronald (September 1977). "Multivalued Dependencies and a New Normal Form for Relational Databases"
(http://www.almaden.ibm.com/cs/people/fagin/tods77.pdf). ACM Transactions on Database Systems 2 (1): 267. doi:10.1145/320557.320571.
[15] Ronald Fagin. "Normal Forms and Relational Database Operators". ACM SIGMOD International Conference on Management of Data, May
31-June 1, 1979, Boston, Mass. Also IBM Research Report RJ2471, Feb. 1979.
[16] Ronald Fagin (1981) A Normal Form for Relational Databases That Is Based on Domains and Keys
(http://www.almaden.ibm.com/cs/people/fagin/tods81.pdf), ACM Transactions on Database Systems, vol. 6, pp. 387–415
• Paper: "Non First Normal Form Relations" by G. Jaeschke, H.-J. Schek; IBM Heidelberg Scientific Center. A paper
studying the normalization and denormalization operators nest and unnest, which are briefly described at the end of
this article.
Further reading
• Litt's Tips: Normalization (http://www.troubleshooters.com/littstip/ltnorm.html)
• Date, C. J. (1999), An Introduction to Database Systems
(http://www.aw-bc.com/catalog/academic/product/0,1144,0321197844,00.html) (8th ed.). Addison-Wesley Longman. ISBN 0-321-19784-4.
• Kent, W. (1983) A Simple Guide to Five Normal Forms in Relational Database Theory
(http://www.bkent.net/Doc/simple5.htm), Communications of the ACM, vol. 26, pp. 120–125
• Date, C.J., & Darwen, H., & Pascal, F. Database Debunkings (http://www.dbdebunk.com)
• H.-J. Schek, P. Pistor Data Structures for an Integrated Data Base Management and Information Retrieval System
External links
• Database Normalization Basics (http://databases.about.com/od/specificproducts/a/normalization.htm) by
Mike Chapple (About.com)
• Database Normalization Intro (http://www.databasejournal.com/sqletc/article.php/1428511), Part 2
(http://www.databasejournal.com/sqletc/article.php/26861_1474411_1)
• An Introduction to Database Normalization (http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html)
by Mike Hillyer.
• Normalization (http://www.utexas.edu/its/windows/database/datamodeling/rm/rm7.html) by ITS,
University of Texas.
• A tutorial on the first 3 normal forms (http://phlonx.com/resources/nf3/) by Fred Coulson
• DB Normalization Examples (http://www.dbnormalization.com/)
• Description of the database normalization basics (http://support.microsoft.com/kb/283878) by Microsoft
• Database Normalization and Design Techniques
(http://www.barrywise.com/2008/01/database-normalization-and-design-techniques/) by Barry Wise, recommended reading
for the Harvard MIS.
• A Simple Guide to Five Normal Forms in Relational Database Theory (http://www.bkent.net/Doc/simple5.htm)
First normal form

First normal form (1NF or Minimal Form) is a normal form used in database normalization. A relational database
table that adheres to 1NF is one that meets a certain minimum set of criteria. These criteria are basically concerned
with ensuring that the table is a faithful representation of a relation[1] and that it is free of repeating groups.[2]
The concept of a "repeating group" is, however, understood in different ways by different theorists. As a
consequence, there is no universal agreement as to which features would disqualify a table from being in 1NF. Most
notably, 1NF as defined by some authors (for example, Ramez Elmasri and Shamkant B. Navathe,[3] following the
precedent established by Edgar F. Codd) excludes relation-valued attributes (tables within tables); whereas 1NF as
defined by other authors (for example, Chris Date) permits them.
1NF tables as representations of relations

According to Date's definition of 1NF, a table is in 1NF if and only if it is "isomorphic to some relation", which
means, specifically, that it satisfies the following five conditions:
1. There's no top-to-bottom ordering to the rows.
2. There's no left-to-right ordering to the columns.
3. There are no duplicate rows.
4. Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
5. All columns are regular [i.e. rows have no hidden components such as row IDs, object IDs, or hidden
timestamps].
—Chris Date, "What First Normal Form Really Means", pp. 127–8[4]
Violation of any of these conditions would mean that the table is not strictly relational, and therefore that it is not in
1NF.
Examples of tables (or views) that would not meet this definition of 1NF are:
• A table that lacks a unique key. Such a table would be able to accommodate duplicate rows, in violation of
condition 3.
• A view whose definition mandates that results be returned in a particular order, so that the row-ordering is an
intrinsic and meaningful aspect of the view.[5] This violates condition 1. The tuples in true relations are not
ordered with respect to each other.
• A table with at least one nullable attribute. A nullable attribute would be in violation of condition 4, which
requires every field to contain exactly one value from its column's domain. It should be noted, however, that this
aspect of condition 4 is controversial. It marks an important departure from Codd's later vision of the relational
model,[6] which made explicit provision for nulls.[7]
Repeating groups
Date's fourth condition, which expresses "what most people think of as the defining feature of 1NF",[8] is concerned
with repeating groups. The following scenario illustrates how a database design might incorporate repeating groups,
in violation of 1NF.
Domains and values

Suppose a novice designer wishes to record the names and telephone numbers of customers. He defines a customer
table which looks like this:
Customer
Customer ID First Name Surname Telephone Number
123 Robert Ingram 555-861-2025
456 Jane Wright 555-403-1659
789 Maria Fernandez 555-808-9633
The designer then becomes aware of a requirement to record multiple telephone numbers for some customers. He
reasons that the simplest way of doing this is to allow the "Telephone Number" field in any given record to contain
more than one value:
Customer
Customer ID  First Name  Surname  Telephone Number
456          Jane        Wright   555-403-1659
                                  555-776-4100
Assuming, however, that the Telephone Number column is defined on some Telephone Number-like domain (e.g.
the domain of strings 12 characters in length), the representation above is not in 1NF. 1NF (and, for that matter, the
RDBMS) prevents a single field from containing more than one value from its column's domain.
Repeating groups across columns

The designer might attempt to get around this restriction by defining multiple Telephone Number columns:
Customer
Customer ID First Name Surname Tel. No. 1 Tel. No. 2 Tel. No. 3
456 Jane Wright 555-403-1659 555-776-4100 555-403-1659
This representation, however, makes use of nullable columns, and therefore does not conform to Date's definition of
1NF. Even if the view is taken that nullable columns are allowed, the design is not in keeping with the spirit of 1NF.
Tel. No. 1, Tel. No. 2, and Tel. No. 3 share exactly the same domain and exactly the same meaning; the splitting of
Telephone Number into three headings is artificial and causes logical problems. These problems include:
• Difficulty in querying the table. Answering such questions as "Which customers have telephone number X?" and
"Which pairs of customers share a telephone number?" is awkward.
• Inability to enforce uniqueness of Customer-to-Telephone Number links through the RDBMS. Customer 789
might mistakenly be given a Tel. No. 2 value that is exactly the same as her Tel. No. 1 value.
• Restriction of the number of telephone numbers per customer to three. If a customer with four telephone numbers
comes along, we are constrained to record only three and leave the fourth unrecorded. This means that the
database design is imposing constraints on the business process, rather than (as should ideally be the case)
vice-versa.
Repeating groups within columns

The designer might, alternatively, retain the single Telephone Number column but alter its domain, making it a string
of sufficient length to accommodate multiple telephone numbers:
Customer
Customer ID First Name Surname Telephone Numbers
456 Jane Wright 555-403-1659, 555-776-4100
This design is consistent with 1NF, but still presents several design issues. The Telephone Number heading becomes
semantically non-specific, as it can now represent either a telephone number, a list of telephone numbers, or indeed
anything at all. A query such as "Which pairs of customers share a telephone number?" is more difficult to
formulate, given the necessity to cater for lists of telephone numbers as well as individual telephone numbers.
Meaningful constraints on telephone numbers are also very difficult to define in the RDBMS with this design.
A design that complies with 1NF

A design that is unambiguously in 1NF makes use of two tables: a Customer Name table and a Customer Telephone
Number table.
Customer Name
Customer ID First Name Surname
123 Robert Ingram
456 Jane Wright
789 Maria Fernandez
Customer Telephone Number

Customer ID Telephone Number
123 555-861-2025
456 555-403-1659
456 555-776-4100
789 555-808-9633
Repeating groups of telephone numbers do not occur in this design. Instead, each Customer-to-Telephone Number
link appears on its own record. With Customer ID as the key field, a "parent-child" or one-to-many (1:M) relationship
exists between the two tables, since a customer record (in the "parent" table) can have many telephone number
records (in the "child" table), but each telephone number usually has one, and only one, customer. It is worth noting
that this design meets the additional requirements for second normal form (2NF) and third normal form (3NF).
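The two-table design can be written down and exercised as follows. This is a sketch using Python's built-in sqlite3
module; the SQL table and column names are assumptions derived from the headings above. It shows that the questions
that were awkward with repeating groups become ordinary queries, and that the RDBMS can now enforce uniqueness of
each Customer-to-Telephone Number link.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE customer_name ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT NOT NULL, surname TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE customer_telephone_number ("
    "customer_id INTEGER NOT NULL REFERENCES customer_name(customer_id), "
    "telephone_number TEXT NOT NULL, "
    "PRIMARY KEY (customer_id, telephone_number))"
)
conn.executemany(
    "INSERT INTO customer_name VALUES (?, ?, ?)",
    [(123, "Robert", "Ingram"), (456, "Jane", "Wright"), (789, "Maria", "Fernandez")],
)
conn.executemany(
    "INSERT INTO customer_telephone_number VALUES (?, ?)",
    [
        (123, "555-861-2025"),
        (456, "555-403-1659"),
        (456, "555-776-4100"),
        (789, "555-808-9633"),
    ],
)

# Which customers have telephone number X?
owners = [
    row[0]
    for row in conn.execute(
        "SELECT customer_id FROM customer_telephone_number WHERE telephone_number = ?",
        ("555-776-4100",),
    )
]

# Which pairs of customers share a telephone number?
shared = conn.execute(
    "SELECT a.customer_id, b.customer_id "
    "FROM customer_telephone_number AS a JOIN customer_telephone_number AS b "
    "ON a.telephone_number = b.telephone_number AND a.customer_id < b.customer_id"
).fetchall()

# Recording the same link twice is rejected by the composite primary key.
try:
    conn.execute("INSERT INTO customer_telephone_number VALUES (789, '555-808-9633')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(owners, shared, duplicate_rejected)
```

A customer with a fourth or fifth telephone number simply gets more rows in the second table, so the limit of three
numbers imposed by the multi-column design disappears.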
Atomicity
Some definitions of 1NF, most notably that of Edgar F. Codd, make reference to the concept of atomicity. Codd
states that the "values in the domains on which each relation is defined are required to be atomic with respect to the
DBMS."[9] Codd defines an atomic value as one that "cannot be decomposed into smaller pieces by the DBMS
(excluding certain special functions)."[10] This means that a field should not be divided into parts holding more
than one kind of data, such that what one part means to the DBMS depends on another part of the same field.
Hugh Darwen and Chris Date have suggested that Codd's concept of an "atomic value" is ambiguous, and that this
ambiguity has led to widespread confusion about how 1NF should be understood.[11] [12] In particular, the notion of a
"value that cannot be decomposed" is problematic, as it would seem to imply that few, if any, data types are atomic:
• A character string would seem not to be atomic, as the RDBMS typically provides operators to decompose it into
substrings.
• A fixed-point number would seem not to be atomic, as the RDBMS typically provides operators to decompose it
into integer and fractional components.
Date suggests that "the notion of atomicity has no absolute meaning":[13] a value may be considered atomic for some
purposes, but may be considered an assemblage of more basic elements for other purposes. If this position is
accepted, 1NF cannot be defined with reference to atomicity. Columns of any conceivable data type (from string
types and numeric types to array types and table types) are then acceptable in a 1NF table, although perhaps not
always desirable; for example, it would be more desirable to separate a Customer Name field into two separate
fields, First Name and Surname. Date argues that relation-valued attributes, by means of which a field within a table
can contain a table, are useful in rare cases.[14]
Normalization beyond 1NF

Any table that is in second normal form (2NF) or higher is, by definition, also in 1NF (each normal form has more
stringent criteria than its predecessor). On the other hand, a table that is in 1NF may or may not be in 2NF; if it is in
2NF, it may or may not be in 3NF, and so on.
Normal forms higher than 1NF are intended to deal with situations in which a table suffers from design problems
that may compromise the integrity of the data within it. For example, the following table is in 1NF, but is not in 2NF
and therefore is vulnerable to logical inconsistencies:
Subscriber Email Addresses

Subscriber ID Email Address Subscriber First Name Subscriber Surname
108 steve@aardvarkmail.net Steve Wallace
252 carol@mongoosemail.org Carol Robertson
252 crobertson@aardvarkmail.net Carol Robertson
360 hclark@antelopemail.com Harriet Clark
The table's key is {Subscriber ID, Email Address}.

If Carol Robertson changes her surname by marriage, the change must be applied to two rows. If the change is only
applied to one row, a contradiction results: the question "What is Subscriber 252's name?" has two conflicting
answers. 2NF addresses this problem. Note that Carol Robertson's name appears in the table twice because she
has more than one email address.
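The contradiction can be reproduced in a few lines of Python. This is only a sketch: the dictionary keys are illustrative spellings of the column headings, and the new surname "Nguyen" is a made-up value.

```python
# Rows of the Subscriber Email Addresses table; the key is (subscriber_id, email).
rows = [
    {"subscriber_id": 108, "email": "steve@aardvarkmail.net",
     "first_name": "Steve", "surname": "Wallace"},
    {"subscriber_id": 252, "email": "carol@mongoosemail.org",
     "first_name": "Carol", "surname": "Robertson"},
    {"subscriber_id": 252, "email": "crobertson@aardvarkmail.net",
     "first_name": "Carol", "surname": "Robertson"},
    {"subscriber_id": 360, "email": "hclark@antelopemail.com",
     "first_name": "Harriet", "surname": "Clark"},
]

def surnames_for(table, subscriber_id):
    """Return every distinct surname recorded for one subscriber."""
    return {row["surname"] for row in table if row["subscriber_id"] == subscriber_id}

# A careless update that touches only one of subscriber 252's two rows.
# "Nguyen" is a hypothetical new surname used purely for illustration.
for row in rows:
    if row["email"] == "carol@mongoosemail.org":
        row["surname"] = "Nguyen"

# Two different answers to "What is Subscriber 252's name?"
conflicting = surnames_for(rows, 252)
```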
A practical way to analyse the above table is to ask a series of questions about the relationships that records
(rows) can have between entities (tables) or attributes (columns), based on given business rules or constraints. For
example, could a Subscriber record relate to many Email Address records? Could an Email Address record relate to
many Subscriber records? In the above table we can see that Carol Robertson has more than one email address. We
could answer the questions by saying that there is a one-to-many (1:M) relationship between Subscriber and Email
Address, since a subscriber can have many email addresses, and an email address usually has one, and only one,
subscriber. We would then create a separate table called Subscribers, move the Subscriber First Name and
Subscriber Surname columns from the Subscriber Email Addresses table into it, and give it a Subscriber ID column
as its primary key. Thus, a one-to-many relationship exists between the Subscribers table (with Subscriber ID as
the primary key) and the Subscriber Email Addresses table (with Subscriber ID as the foreign key). Both tables
would conform to 1NF.
References
[1] "[T]he overriding requirement, to the effect that the table must directly and faithfully represent a relation, follows from the fact that 1NF was
originally defined as a property of relations, not tables." Date, C. J. "What First Normal Form Really Means"
(http://www.dbdebunk.com/page/page/629796.htm) in Date on Database: Writings 2000-2006 (Springer-Verlag, 2006), p. 128.
[2] "First normal form excludes variable repeating fields and groups." Kent, William. "A Simple Guide to Five Normal Forms in Relational
Database Theory" (http://www.bkent.net/Doc/simple5.htm), Communications of the ACM 26 (2), Feb. 1983, pp. 120–125.
[3] Elmasri, Ramez and Navathe, Shamkant B. Fundamentals of Database Systems, Fourth Edition (Addison-Wesley, 2003), p. 315.
[4] Date, C. J. "What First Normal Form Really Means" (http://www.dbdebunk.com/page/page/629796.htm) pp. 127–128.
[5] Such views cannot be created using SQL that conforms to the SQL:2003 standard.
[6] "Codd first defined the relational model in 1969 and didn't introduce nulls until 1979" Date, C. J. SQL and Relational Theory (O'Reilly,
2009), Appendix A.2.
[7] The third of Codd's 12 rules states that "Null values ... [must be] supported in a fully relational DBMS for representing missing information
and inapplicable information in a systematic way, independent of data type." Codd, E. F. "Is Your DBMS Really Relational?" Computerworld,
October 14, 1985.
[8] Date, C. J. "What First Normal Form Really Means" (http://www.dbdebunk.com/page/page/629796.htm) p. 128.
[9] Codd, E. F. The Relational Model for Database Management Version 2 (Addison-Wesley, 1990).
[10] Codd, E. F. The Relational Model for Database Management Version 2 (Addison-Wesley, 1990), p. 6.
[11] Darwen, Hugh. "Relation-Valued Attributes; or, Will the Real First Normal Form Please Stand Up?", in C. J. Date and Hugh Darwen,
Relational Database Writings 1989-1991 (Addison-Wesley, 1992).
[12] "[F]or many years," writes Date, "I was as confused as anyone else. What's worse, I did my best (worst?) to spread that confusion through
my writings, seminars, and other presentations." Date, C. J. "What First Normal Form Really Means"
(http://www.dbdebunk.com/page/page/629796.htm) in Date on Database: Writings 2000-2006 (Springer-Verlag, 2006), p. 108.
[13] Date, C. J. "What First Normal Form Really Means" (http://www.dbdebunk.com/page/page/629796.htm) p. 112.
[14] Date, C. J. "What First Normal Form Really Means" (http://www.dbdebunk.com/page/page/629796.htm) pp. 121–126.
Further reading
• Rules Of Data Normalization (http://www.datamodel.org/NormalizationRules.html)
• Date, C. J., & Lorentzos, N., & Darwen, H. (2002). Temporal Data & the Relational Model
(http://www.elsevier.com/wps/product/cws_home/680662) (1st ed.). Morgan Kaufmann. ISBN 1-55860-855-9.
• Date, C. J., & Darwen, H., & Pascal, F. Database Debunkings (http://www.dbdebunk.com)
External links
• Rules of Data Normalization (http://www.datamodel.org/NormalizationRules.html) by Data Model.org
Second normal form

Second normal form (2NF) is a normal form used in database normalization. 2NF was originally defined by E.F.
Codd in 1971.[1] A table that is in first normal form (1NF) must meet additional criteria if it is to qualify for second
normal form. Specifically: a 1NF table is in 2NF if and only if, given any candidate key K and any attribute A that is
not a constituent of a candidate key, A depends upon the whole of K rather than just a part of it.
In slightly more formal terms: a 1NF table is in 2NF if and only if all its non-prime attributes are functionally
dependent on the whole of every candidate key. (A non-prime attribute is one that does not belong to any candidate
key.)
Note that when a 1NF table has no composite candidate keys (candidate keys consisting of more than one attribute),
the table is automatically in 2NF.
Example
Consider a table describing employees' skills:
Employees' Skills
Employee Skill Current Work Location
Jones Typing 114 Main Street
Jones Shorthand 114 Main Street
Jones Whittling 114 Main Street
Bravo Light Cleaning 73 Industrial Way
Ellis Alchemy 73 Industrial Way
Ellis Flying 73 Industrial Way
Harrison Light Cleaning 73 Industrial Way
Neither {Employee} nor {Skill} is a candidate key for the table. This is because a given Employee might need to
appear more than once (he might have multiple Skills), and a given Skill might need to appear more than once (it
might be possessed by multiple Employees). Only the composite key {Employee, Skill} qualifies as a candidate key
for the table.
The remaining attribute, Current Work Location, is dependent on only part of the candidate key, namely Employee.
Therefore the table is not in 2NF. Note the redundancy in the way Current Work Locations are represented: we are
told three times that Jones works at 114 Main Street, and twice that Ellis works at 73 Industrial Way. This
redundancy makes the table vulnerable to update anomalies: it is, for example, possible to update Jones' work
location on his "Typing" and "Shorthand" records and not update his "Whittling" record. The resulting data would
imply contradictory answers to the question "What is Jones' current work location?"
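The partial dependency and the resulting update anomaly can be checked mechanically. The sketch below uses a hypothetical helper, fd_holds, that tests whether a set of rows is consistent with a functional dependency. A functional dependency is a rule about every legal state of a table, so a check against sample rows can refute a dependency but cannot establish it. The address "200 Elm Road" is a made-up value.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

COLUMNS = ("Employee", "Skill", "Current Work Location")
employees_skills = [dict(zip(COLUMNS, r)) for r in [
    ("Jones", "Typing", "114 Main Street"),
    ("Jones", "Shorthand", "114 Main Street"),
    ("Jones", "Whittling", "114 Main Street"),
    ("Bravo", "Light Cleaning", "73 Industrial Way"),
    ("Ellis", "Alchemy", "73 Industrial Way"),
    ("Ellis", "Flying", "73 Industrial Way"),
    ("Harrison", "Light Cleaning", "73 Industrial Way"),
]]

# Employee alone determines the location: a dependency on part of the key.
partial = fd_holds(employees_skills, ["Employee"], ["Current Work Location"])

# Employee alone does not determine the whole row, so it is not a key.
employee_is_key = fd_holds(employees_skills, ["Employee"],
                           ["Skill", "Current Work Location"])

# Inconsistent update: two of Jones' three rows are changed.
# "200 Elm Road" is a hypothetical new address.
updated = [dict(r) for r in employees_skills]
for r in updated:
    if r["Employee"] == "Jones" and r["Skill"] in ("Typing", "Shorthand"):
        r["Current Work Location"] = "200 Elm Road"
still_consistent = fd_holds(updated, ["Employee"], ["Current Work Location"])
```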
A 2NF alternative to this design would represent the same information in two tables: an "Employees" table with
candidate key {Employee}, and an "Employees' Skills" table with candidate key {Employee, Skill}:
Employees
Employee Current Work Location
Jones 114 Main Street
Bravo 73 Industrial Way
Ellis 73 Industrial Way
Harrison 73 Industrial Way
Employees' Skills
Employee Skill
Jones Typing
Jones Shorthand
Jones Whittling
Bravo Light Cleaning
Ellis Alchemy
Ellis Flying
Harrison Light Cleaning
Neither of these tables can suffer from update anomalies.

Not all 2NF tables are free from update anomalies, however. An example of a 2NF table which suffers from update
anomalies is:
Tournament Winners
Tournament Year Winner Winner Date of Birth
Des Moines Masters 1998 Chip Masterson 14 March 1977
Indiana Invitational 1998 Al Fredrickson 21 July 1975
Cleveland Open 1999 Bob Albertson 28 September 1968
Des Moines Masters 1999 Al Fredrickson 21 July 1975
Indiana Invitational 1999 Chip Masterson 14 March 1977
Even though Winner and Winner Date of Birth are determined by the whole key {Tournament, Year} and not part
of it, particular Winner / Winner Date of Birth combinations are shown redundantly on multiple records. This leads
to an update anomaly: if updates are not carried out consistently, a particular winner could be shown as having two
different dates of birth.
The underlying problem is the transitive dependency to which the Winner Date of Birth attribute is subject. Winner
Date of Birth actually depends on Winner, which in turn depends on the key {Tournament, Year}.
This problem is addressed by third normal form (3NF).
2NF and candidate keys

A table for which there are no partial functional dependencies on the primary key is typically, but not always, in
2NF. In addition to the primary key, the table may contain other candidate keys; it is necessary to establish that no
non-prime attributes have part-key dependencies on any of these candidate keys.
Multiple candidate keys occur in the following table:
Electric Toothbrush Models

Manufacturer Model Model Full Name Manufacturer Country
Forte X-Prime Forte X-Prime Italy
Forte Ultraclean Forte Ultraclean Italy
Dent-o-Fresh EZbrush Dent-o-Fresh EZBrush USA
Kobayashi ST-60 Kobayashi ST-60 Japan
Hoch Toothmaster Hoch Toothmaster Germany
Hoch X-Prime Hoch X-Prime Germany
Even if the designer has specified the primary key as {Model Full Name}, the table is not in 2NF. {Manufacturer,
Model} is also a candidate key, and Manufacturer Country is dependent on a proper subset of it: Manufacturer. To
make the design conform to 2NF, it is necessary to have two tables:
Electric Toothbrush Manufacturers

Manufacturer Manufacturer Country
Forte Italy
Dent-o-Fresh USA
Kobayashi Japan
Hoch Germany
Electric Toothbrush Models

Manufacturer Model Model Full Name
Forte X-Prime Forte X-Prime
Forte Ultraclean Forte Ultraclean
Dent-o-Fresh EZbrush Dent-o-Fresh EZBrush
Kobayashi ST-60 Kobayashi ST-60
Hoch Toothmaster Hoch Toothmaster
Hoch X-Prime Hoch X-Prime
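The reasoning about candidate keys can be sketched in Python using attribute closure. The three functional dependencies listed in FDS are an assumption read off from the discussion above (the text does not enumerate them), and the function names are hypothetical. The brute-force key search is only suitable for small schemas like this one.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes functionally determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole schema (brute force)."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = frozenset(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys

def partial_key_dependencies(attributes, fds):
    """(part of a candidate key, non-prime attribute) pairs that break 2NF."""
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    non_prime = set(attributes) - prime
    found = []
    for key in keys:
        for size in range(1, len(key)):  # proper, non-empty subsets only
            for part in combinations(sorted(key), size):
                for attr in sorted(closure(part, fds) & non_prime):
                    found.append((frozenset(part), attr))
    return found

ATTRS = frozenset({"Manufacturer", "Model", "Model Full Name", "Manufacturer Country"})
# Assumed dependencies for the original four-column table.
FDS = [
    (frozenset({"Manufacturer", "Model"}), frozenset({"Model Full Name"})),
    (frozenset({"Model Full Name"}), frozenset({"Manufacturer", "Model"})),
    (frozenset({"Manufacturer"}), frozenset({"Manufacturer Country"})),
]

keys = candidate_keys(ATTRS, FDS)
violations = partial_key_dependencies(ATTRS, FDS)

# The decomposed Electric Toothbrush Models table has no such dependency.
MODEL_ATTRS = frozenset({"Manufacturer", "Model", "Model Full Name"})
model_violations = partial_key_dependencies(MODEL_ATTRS, FDS[:2])
```

Under these assumptions the search finds both candidate keys, and the only partial-key dependency reported is that of Manufacturer Country on Manufacturer.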

References
[1] Codd, E.F. "Further Normalization of the Data Base Relational Model." (Presented at Courant Computer Science Symposia Series 6, "Data
Base Systems," New York City, May 24th-25th, 1971.) IBM Research Report RJ909 (August 31st, 1971). Republished in Randall J. Rustin
(ed.), Data Base Systems: Courant Computer Science Symposia Series 6. Prentice-Hall, 1972.
Further reading
• Date, C. J., & Lorentzos, N., & Darwen, H. (2002). Temporal Data & the Relational Model
(http://www.elsevier.com/wps/product/cws_home/680662) (1st ed.). Morgan Kaufmann. ISBN 1-55860-855-9.
• Date, C. J. (2004). Introduction to Database Systems (8th ed.). Boston: Addison-Wesley. ISBN 9780321197849.
Third normal form

The third normal form (3NF) is a normal form used in database normalization. 3NF was originally defined by E.F.
Codd in 1971.[1] Codd's definition states that a table is in 3NF if and only if both of the following conditions hold:
• The relation R (table) is in second normal form (2NF)
• Every non-prime attribute of R is non-transitively dependent (i.e. directly dependent) on every candidate key of
R.
A non-prime attribute of R is an attribute that does not belong to any candidate key of R.[2] A transitive
dependency is a functional dependency in which X → Z (X determines Z) indirectly, by virtue of X → Y and Y → Z
(where it is not the case that Y → X).[3]
A 3NF definition that is equivalent to Codd's, but expressed differently, was given by Carlo Zaniolo in 1982. This
definition states that a table is in 3NF if and only if, for each of its functional dependencies X → A, at least one of
the following conditions holds:
• X contains A (that is, X → A is a trivial functional dependency), or
• X is a superkey, or
• A-X, the set difference between A and X, is a prime attribute (i.e., A-X is contained within a candidate key)[4]
Zaniolo's definition gives a clear sense of the difference between 3NF and the more stringent Boyce–Codd normal
form (BCNF). BCNF simply eliminates the third alternative ("A is a prime attribute").
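Zaniolo's three conditions translate directly into a check over a list of functional dependencies. The following is a minimal sketch with hypothetical function names. It inspects only the dependencies it is given, and the dependencies stated for the Tournament Winners table (shown in the second normal form article) are assumptions. The schema R(A, B, C) with AB → C and C → B is a made-up example chosen because it satisfies 3NF only through the third condition.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes functionally determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole schema (brute force)."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = frozenset(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys

def violations(attributes, fds):
    """Return (3NF violations, BCNF violations) as lists of (X, A) pairs."""
    prime = set().union(*candidate_keys(attributes, fds))
    not_3nf, not_bcnf = [], []
    for lhs, rhs in fds:
        if closure(lhs, fds) == attributes:
            continue                      # condition 2: X is a superkey
        for attr in sorted(rhs - lhs):    # condition 1: skip the trivial part
            not_bcnf.append((lhs, attr))  # BCNF allows no further exception
            if attr not in prime:         # condition 3: A is a prime attribute
                not_3nf.append((lhs, attr))
    return not_3nf, not_bcnf

# Tournament Winners, with assumed dependencies.
T_ATTRS = frozenset({"Tournament", "Year", "Winner", "Winner Date of Birth"})
T_FDS = [
    (frozenset({"Tournament", "Year"}), frozenset({"Winner"})),
    (frozenset({"Winner"}), frozenset({"Winner Date of Birth"})),
]
t_3nf, t_bcnf = violations(T_ATTRS, T_FDS)

# Hypothetical schema R(A, B, C): in 3NF, but not in BCNF.
R_ATTRS = frozenset({"A", "B", "C"})
R_FDS = [
    (frozenset({"A", "B"}), frozenset({"C"})),
    (frozenset({"C"}), frozenset({"B"})),
]
r_3nf, r_bcnf = violations(R_ATTRS, R_FDS)
```

In the made-up schema, C → B is excused under 3NF because B belongs to the candidate key {A, B}, but it is still reported as a BCNF violation because {C} is not a superkey.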
"Nothing but the key"

A memorable summary of Codd's definition of 3NF, paralleling the traditional pledge to give true evidence in a court
of law, was given by Bill Kent: every non-key attribute "must provide a fact about the key, the whole key, and
nothing but the key."[5] A common variation supplements this definition with the oath: "so help me Codd".[6]
Requiring that non-key attributes be dependent on "the whole key" ensures that a table is in 2NF; further requiring
that non-key attributes be dependent on "nothing but the key" ensures that the table is in 3NF.
Chris Date refers to Kent's summary as "an intuitively attractive characterization" of 3NF, and notes that with slight
adaptation it may serve as a definition of the slightly-stronger Boyce–Codd normal form: "Each attribute must
represent a fact about the key, the whole key, and nothing but the key."[7] The 3NF version of the definition is
weaker than Date's BCNF variation, as the former is concerned only with ensuring that non-key attributes are
dependent on keys. Prime attributes (which are keys or parts of keys) must not be functionally dependent at all; they
each represent a fact about the key in the sense of providing part or all of the key itself. (It should be noted here that
this rule applies only to functionally dependent attributes, as applying it to all attributes would implicitly prohibit
composite candidate keys, since each part of any such key would violate the "whole key" clause.)
An example of a 2NF table that fails to meet the requirements of 3NF is:
Tournament Winners
Tournament Year Winner Winner Date of Birth
Indiana Invitational 1998 Al Fredrickson 21 July 1975
Cleveland Open 1999 Bob Albertson 28 September 1968
Des Moines Masters 1999 Al Fredrickson 21 July 1975
Indiana Invitational 1999 Chip Masterson 14 March 1977
Because each row in the table needs to tell us who won a particular Tournament in a particular Year, the composite
key {Tournament, Year} is a minimal set of attributes guaranteed to uniquely identify a row. That is, {Tournament,
Year} is a candidate key for the table.
The breach of 3NF occurs because the non-prime attribute Winner Date of Birth is transitively dependent on the
candidate key {Tournament, Year} via the non-prime attribute Winner. The fact that Winner Date of Birth is
functionally dependent on Winner makes the table vulnerable to logical inconsistencies, as there is nothing to stop
the same person from being shown with different dates of birth on different records.
In order to express the same facts without violating 3NF, it is necessary to split the table into two:
Tournament Winners
Tournament Year Winner
Indiana Invitational 1998 Al Fredrickson
Cleveland Open 1999 Bob Albertson
Des Moines Masters 1999 Al Fredrickson
Indiana Invitational 1999 Chip Masterson
Player Dates of Birth

Player Date of Birth
Chip Masterson 14 March 1977
Al Fredrickson 21 July 1975
Bob Albertson 28 September 1968
Update anomalies cannot occur in these tables, which are both in 3NF.
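As a sketch of why the split loses no information, the two 3NF tables can be derived from the original rows in Python and joined back together on Winner. The variable names are illustrative. Storing dates of birth in a dictionary keyed by player makes it impossible to record two dates for one person.

```python
# The original four-column Tournament Winners table.
tournament_winners = {
    ("Indiana Invitational", 1998, "Al Fredrickson", "21 July 1975"),
    ("Cleveland Open", 1999, "Bob Albertson", "28 September 1968"),
    ("Des Moines Masters", 1999, "Al Fredrickson", "21 July 1975"),
    ("Indiana Invitational", 1999, "Chip Masterson", "14 March 1977"),
}

# Projection 1: Tournament Winners (Tournament, Year, Winner).
winners = {(t, y, w) for t, y, w, _ in tournament_winners}

# Projection 2: Player Dates of Birth, one entry per player.
dates_of_birth = {}
for _, _, player, dob in tournament_winners:
    if dates_of_birth.setdefault(player, dob) != dob:
        raise ValueError(f"conflicting dates of birth for {player}")

# Joining the projections on Winner reproduces the original table exactly.
rejoined = {(t, y, w, dates_of_birth[w]) for t, y, w in winners}
```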
Derivation of Zaniolo's conditions

The definition of 3NF offered by Carlo Zaniolo in 1982, and given above, is proved in the following way: Let X →
A be a nontrivial FD (i.e. one where X does not contain A) and let A be a non-key attribute. Also let Y be a key of R.
Then Y → X. Therefore A is not transitively dependent on Y if and only if X → Y, that is, if and only if X is a
superkey.[8]
Normalization beyond 3NF

Most 3NF tables are free of update, insertion, and deletion anomalies. Certain types of 3NF tables, rarely met with in
practice, are affected by such anomalies; these are tables which either fall short of Boyce–Codd normal form
(BCNF) or, if they meet BCNF, fall short of the higher normal forms 4NF or 5NF.
References
[1] Codd, E.F. "Further Normalization of the Data Base Relational Model." (Presented at Courant Computer Science Symposia Series 6, "Data
Base Systems," New York City, May 24th–25th, 1971.) IBM Research Report RJ909 (August 31st, 1971). Republished in Randall J. Rustin
(ed.), Data Base Systems: Courant Computer Science Symposia Series 6. Prentice-Hall, 1972.
[2] Codd, p. 43.
[3] Codd, p. 45–46.
September 1982.
[5] Kent, William. "A Simple Guide to Five Normal Forms in Relational Database Theory" (http://www.bkent.net/Doc/simple5.htm),
Communications of the ACM 26 (2), Feb. 1983, pp. 120–125.
[6] The author of a 1989 book on database management credits one of his students with coming up with the "so help me Codd" addendum. Diehr,
George. Database Management (Scott, Foresman, 1989), p. 331.
[7] Date, C.J. An Introduction to Database Systems (7th ed.) (Addison Wesley, 2000), p. 379.
[8] Zaniolo, p. 494.
Fourth normal form

Fourth normal form (4NF) is a normal form used in database normalization. Introduced by Ronald Fagin in 1977,
4NF is the next level of normalization after Boyce–Codd normal form (BCNF). Whereas the second, third, and
Boyce–Codd normal forms are concerned with functional dependencies, 4NF is concerned with a more general type
of dependency known as a multivalued dependency. A table is in 4NF if and only if, for every one of its non-trivial
multivalued dependencies X →→ Y, X is a superkey—that is, X is either a candidate key or a superset thereof.[1]
Multivalued dependencies
If the column headings in a relational database table are divided into three disjoint groupings X, Y, and Z, then, in the
context of a particular row, we can refer to the data beneath each group of headings as x, y, and z respectively. A
multivalued dependency X →→ Y signifies that if we choose any x actually occurring in the table (call this choice
xc), and compile a list of all the xcyz combinations that occur in the table, we will find that xc is associated with the
same y entries regardless of z.
A trivial multivalued dependency X →→ Y is one in which Y consists of all columns not belonging to X. That is, a
subset of attributes in a table has a trivial multivalued dependency on the remaining subset of attributes.
A functional dependency is a special case of multivalued dependency. In a functional dependency X → Y, every x
determines exactly one y, never more than one.
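The definition of a multivalued dependency can be expressed as a check over rows: for each x, the (y, z) combinations that occur must be every pairing of the y values with the z values seen for that x. The sketch below uses a hypothetical function mvd_holds and made-up three-column data. As with any check against sample rows, it can refute a dependency but cannot prove that one holds for every legal state of a table.

```python
from itertools import product

def mvd_holds(rows, attributes, x, y):
    """Check X ->> Y against rows; Z is every attribute in neither X nor Y."""
    z = [a for a in attributes if a not in x and a not in y]
    groups = {}
    for row in rows:
        record = dict(zip(attributes, row))
        x_val = tuple(record[a] for a in x)
        y_val = tuple(record[a] for a in y)
        z_val = tuple(record[a] for a in z)
        groups.setdefault(x_val, set()).add((y_val, z_val))
    for pairs in groups.values():
        y_values = {pair[0] for pair in pairs}
        z_values = {pair[1] for pair in pairs}
        # Each x must carry the same y entries regardless of z.
        if pairs != set(product(y_values, z_values)):
            return False
    return True

ATTRIBUTES = ("X", "Y", "Z")
# Hypothetical data: x1 is paired with {y1, y2} for both z1 and z2.
rows = [
    ("x1", "y1", "z1"),
    ("x1", "y1", "z2"),
    ("x1", "y2", "z1"),
    ("x1", "y2", "z2"),
    ("x2", "y3", "z1"),
]
holds = mvd_holds(rows, ATTRIBUTES, ["X"], ["Y"])

# Dropping one combination breaks the dependency.
holds_after_removal = mvd_holds(rows[:-2] + rows[-1:], ATTRIBUTES, ["X"], ["Y"])
```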
Example
Consider the following example:
Pizza Delivery Permutations

Restaurant Pizza Variety Delivery Area
A1 Pizza Thick Crust Springfield
A1 Pizza Thick Crust Shelbyville
A1 Pizza Thick Crust Capital City
A1 Pizza Stuffed Crust Springfield
A1 Pizza Stuffed Crust Shelbyville
A1 Pizza Stuffed Crust Capital City
Elite Pizza Thin Crust Capital City
Elite Pizza Stuffed Crust Capital City
Vincenzo's Pizza Thick Crust Springfield
Vincenzo's Pizza Thick Crust Shelbyville
Vincenzo's Pizza Thin Crust Springfield
Vincenzo's Pizza Thin Crust Shelbyville
Each row indicates that a given restaurant can deliver a given variety of pizza to a given area.
The table has no non-key attributes because its only key is {Restaurant, Pizza Variety, Delivery Area}. Therefore it
meets all normal forms up to BCNF. If we assume, however, that pizza varieties offered by a restaurant are not
affected by delivery area, then it does not meet 4NF. The problem is that the table features two non-trivial
multivalued dependencies on the {Restaurant} attribute (which is not a superkey). The dependencies are:
• {Restaurant} →→ {Pizza Variety}

• {Restaurant} →→ {Delivery Area}
These non-trivial multivalued dependencies on a non-superkey reflect the fact that the varieties of pizza a restaurant
offers are independent from the areas to which the restaurant delivers. This state of affairs leads to redundancy in the
table: for example, we are told three times that A1 Pizza offers Stuffed Crust, and if A1 Pizza starts producing
Cheese Crust pizzas then we will need to add multiple rows, one for each of A1 Pizza's delivery areas. There is,
moreover, nothing to prevent us from doing this incorrectly: we might add Cheese Crust rows for all but one of A1
Pizza's delivery areas, thereby failing to respect the multivalued dependency {Restaurant} →→ {Pizza Variety}.
To eliminate the possibility of these anomalies, we must place the facts about varieties offered into a different table
from the facts about delivery areas, yielding two tables that are both in 4NF:
Varieties By Restaurant
Restaurant Pizza Variety
A1 Pizza Thick Crust
A1 Pizza Stuffed Crust
Elite Pizza Thin Crust
Elite Pizza Stuffed Crust
Vincenzo's Pizza Thick Crust
Vincenzo's Pizza Thin Crust
Delivery Areas By Restaurant

Restaurant Delivery Area
A1 Pizza Springfield
A1 Pizza Shelbyville
A1 Pizza Capital City
Elite Pizza Capital City
Vincenzo's Pizza Springfield
Vincenzo's Pizza Shelbyville
In contrast, if the pizza varieties offered by a restaurant sometimes did legitimately vary from one delivery area to
another, the original three-column table would satisfy 4NF.
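The decomposition can be checked with a short Python sketch: projecting the three-column table onto the two smaller tables and joining them on Restaurant gives back exactly the original rows. The function name join is illustrative. The second half models the incorrect insertion described above, with Cheese Crust rows added for only two of A1 Pizza's three delivery areas.

```python
# Pizza Delivery Permutations: (Restaurant, Pizza Variety, Delivery Area).
permutations = {
    ("A1 Pizza", "Thick Crust", "Springfield"),
    ("A1 Pizza", "Thick Crust", "Shelbyville"),
    ("A1 Pizza", "Thick Crust", "Capital City"),
    ("A1 Pizza", "Stuffed Crust", "Springfield"),
    ("A1 Pizza", "Stuffed Crust", "Shelbyville"),
    ("A1 Pizza", "Stuffed Crust", "Capital City"),
    ("Elite Pizza", "Thin Crust", "Capital City"),
    ("Elite Pizza", "Stuffed Crust", "Capital City"),
    ("Vincenzo's Pizza", "Thick Crust", "Springfield"),
    ("Vincenzo's Pizza", "Thick Crust", "Shelbyville"),
    ("Vincenzo's Pizza", "Thin Crust", "Springfield"),
    ("Vincenzo's Pizza", "Thin Crust", "Shelbyville"),
}

def join(varieties, areas):
    """Natural join of the two projections on Restaurant."""
    return {(r, v, a) for r, v in varieties for r2, a in areas if r == r2}

varieties_by_restaurant = {(r, v) for r, v, _ in permutations}
delivery_areas_by_restaurant = {(r, a) for r, _, a in permutations}
rejoined = join(varieties_by_restaurant, delivery_areas_by_restaurant)

# Incorrect insertion into the three-column table: Capital City is forgotten.
broken = permutations | {
    ("A1 Pizza", "Cheese Crust", "Springfield"),
    ("A1 Pizza", "Cheese Crust", "Shelbyville"),
}
missing = join({(r, v) for r, v, _ in broken},
               {(r, a) for r, _, a in broken}) - broken
```

The row reported as missing is the one whose absence violates {Restaurant} →→ {Pizza Variety}. In the two-table design the new variety is a single row in Varieties By Restaurant, so the mistake cannot be made.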
Ronald Fagin demonstrated that it is always possible to achieve 4NF.[2] Rissanen's theorem is also applicable to
multivalued dependencies.
4NF in practice
A 1992 paper by Margaret S. Wu notes that the teaching of database normalization typically stops short of 4NF,
perhaps because of a belief that tables violating 4NF (but meeting all lower normal forms) are rarely encountered in
business applications. This belief may not be accurate, however. Wu reports that in a study of forty organizational
databases, over 20% contained one or more tables that violated 4NF while meeting all lower normal forms.[3]
References
[1] "A relation schema R* is in fourth normal form (4NF) if, whenever a nontrivial multivalued dependency X →→ Y holds for R*, then so does
the functional dependency X → A for every column name A of R*. Intuitively all dependencies are the result of keys." Fagin, Ronald
(September 1977). "Multivalued Dependencies and a New Normal Form for Relational Databases"
(http://www.almaden.ibm.com/cs/people/fagin/tods77.pdf). ACM Transactions on Database Systems 2 (1): 267. doi:10.1145/320557.320571.
[2] Fagin, p. 268
[3] Wu, Margaret S. (March 1992). "The Practical Need for Fourth Normal Form". ACM SIGCSE Bulletin 24 (1): 19–23.
doi:10.1145/135250.134515.
Further reading
• Advanced Normalization (http://www.utexas.edu/its/windows/database/datamodeling/rm/rm8.html) by
ITS, University of Texas.
Fifth normal form

Fifth normal form (5NF), also known as project-join normal form (PJ/NF), is a level of database normalization
designed to reduce redundancy in relational databases recording multi-valued facts by isolating semantically related
multiple relationships. A table is said to be in 5NF if and only if every join dependency in it is implied by the
candidate keys.
A join dependency *{A, B, … Z} on R is implied by the candidate key(s) of R if and only if each of A, B, …, Z is a
superkey for R.[1]
Example
Consider the following example:
Travelling Salesman Product Availability By Brand

Travelling Salesman Brand Product Type
Jack Schneider Acme Vacuum Cleaner
Jack Schneider Acme Breadbox
Willy Loman Robusto Pruning Shears
Willy Loman Robusto Vacuum Cleaner
Willy Loman Robusto Breadbox
Willy Loman Robusto Umbrella Stand
Louis Ferguson Robusto Vacuum Cleaner
Louis Ferguson Robusto Telescope
Louis Ferguson Acme Vacuum Cleaner
Louis Ferguson Acme Lava Lamp
Louis Ferguson Nimbus Tie Rack
The table's predicate is: Products of the type designated by Product Type, made by the brand designated by Brand,
are available from the travelling salesman designated by Travelling Salesman.
In the absence of any rules restricting the valid possible combinations of Travelling Salesman, Brand, and Product
Type, the three-attribute table above is necessary in order to model the situation correctly.
Suppose, however, that the following rule applies: A Travelling Salesman has certain Brands and certain Product
Types in his repertoire. If Brand B is in his repertoire, and Product Type P is in his repertoire, then (assuming
Brand B makes Product Type P), the Travelling Salesman must offer products of Product Type P made by Brand B.
In that case, it is possible to split the table into three:
Product Types By Travelling Salesman

Travelling Salesman Product Type
Jack Schneider Vacuum Cleaner
Jack Schneider Breadbox
Willy Loman Pruning Shears
Willy Loman Vacuum Cleaner
Willy Loman Breadbox
Willy Loman Umbrella Stand
Louis Ferguson Telescope
Louis Ferguson Vacuum Cleaner
Louis Ferguson Lava Lamp
Louis Ferguson Tie Rack

Brands By Travelling Salesman

Travelling Salesman Brand
Jack Schneider Acme
Willy Loman Robusto
Louis Ferguson Robusto
Louis Ferguson Acme
Louis Ferguson Nimbus
Product Types By Brand

Brand Product Type
Acme Vacuum Cleaner
Acme Breadbox
Acme Lava Lamp
Robusto Pruning Shears
Robusto Vacuum Cleaner
Robusto Breadbox
Robusto Umbrella Stand
Robusto Telescope
Nimbus Tie Rack
Note how this setup helps to remove redundancy. Suppose that Jack Schneider starts selling Robusto's products. In
the previous setup we would have to add two new entries since Jack Schneider is able to sell two Product Types
covered by Robusto: Breadboxes and Vacuum Cleaners. With the new setup we need only add a single entry (in
Brands By Travelling Salesman).
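The three-way split can be verified with a short Python sketch. The function name join3 and the variable names are illustrative. The three projections are computed from the original table, joined back together, and then the join is recomputed after adding the single (Jack Schneider, Robusto) entry.

```python
# Original table: (Travelling Salesman, Brand, Product Type).
original = {
    ("Jack Schneider", "Acme", "Vacuum Cleaner"),
    ("Jack Schneider", "Acme", "Breadbox"),
    ("Willy Loman", "Robusto", "Pruning Shears"),
    ("Willy Loman", "Robusto", "Vacuum Cleaner"),
    ("Willy Loman", "Robusto", "Breadbox"),
    ("Willy Loman", "Robusto", "Umbrella Stand"),
    ("Louis Ferguson", "Robusto", "Vacuum Cleaner"),
    ("Louis Ferguson", "Robusto", "Telescope"),
    ("Louis Ferguson", "Acme", "Vacuum Cleaner"),
    ("Louis Ferguson", "Acme", "Lava Lamp"),
    ("Louis Ferguson", "Nimbus", "Tie Rack"),
}

# The three two-column projections.
product_types_by_salesman = {(s, p) for s, _, p in original}
brands_by_salesman = {(s, b) for s, b, _ in original}
product_types_by_brand = {(b, p) for _, b, p in original}

def join3(sp, sb, bp):
    """Rows (s, b, p) supported by all three projections."""
    return {(s, b, p)
            for s, b in sb
            for b2, p in bp
            if b == b2 and (s, p) in sp}

rejoined = join3(product_types_by_salesman, brands_by_salesman,
                 product_types_by_brand)

# Jack Schneider starts selling Robusto: one new entry in one table.
new_rows = join3(product_types_by_salesman,
                 brands_by_salesman | {("Jack Schneider", "Robusto")},
                 product_types_by_brand) - original
```

The single new entry yields the two rows that would otherwise have to be added by hand to the three-column table: Robusto vacuum cleaners and Robusto breadboxes sold by Jack Schneider.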
Usage
Only in rare situations does a 4NF table not conform to 5NF. These are situations in which a complex real-world
constraint governing the valid combinations of attribute values in the 4NF table is not implicit in the structure of that
table. If such a table is not normalized to 5NF, the burden of maintaining the logical consistency of the data within
the table must be carried partly by the application responsible for insertions, deletions, and updates to it; and there is
a heightened risk that the data within the table will become inconsistent. In contrast, the 5NF design excludes the
possibility of such inconsistencies.
References
[1] Analysis of normal forms for anchor-tables (http://www.anchormodeling.com/tiedostot/6nf.pdf)
Further reading
• Advanced Normalization (http://www.utexas.edu/its/windows/database/datamodeling/rm/rm8.html)
Sixth normal form

Sixth normal form (6NF) is a term in relational database theory, used in two different ways.
6NF (C. Date's definition)

A book by Christopher J. Date and others on temporal databases[1] defined sixth normal form as a normal form for
databases based on an extension of the relational algebra.
In this work, the relational operators, such as join, are generalized to support a natural treatment of interval data,
such as sequences of dates or moments in time.[2] Sixth normal form is then based on this generalized join, as
follows:
A relvar R [table] is in sixth normal form (abbreviated 6NF) if and only if it satisfies no nontrivial join
dependencies at all — where, as before, a join dependency is trivial if and only if at least one of the
projections (possibly U_projections) involved is taken over the set of all attributes of the relvar [table]
concerned.[Date et al.][3]
Any relation in 6NF is also in 5NF.
Sixth normal form is intended to decompose relation variables to irreducible components. Though this may be
relatively unimportant for non-temporal relation variables, it can be important when dealing with temporal variables
or other interval data. For instance, if a relation comprises a supplier's name, status, and city, we may also want to
add temporal data, such as the time during which these values are, or were, valid (e.g., for historical data) but the
three values may vary independently of each other and at different rates. We may, for instance, wish to trace the
history of changes to Status.
For further discussion on Temporal Aggregation in SQL, see also Zimanyi.[4] For a non-relational approach, see
TSQL2.[5]
DKNF
Some authors use the term sixth normal form differently, namely, as a synonym for Domain/key normal form
(DKNF). This usage predates Date et al.'s work.[6]
Usage
The sixth normal form is currently being used in some data warehouses where the benefits outweigh the
drawbacks,[7] for example using Anchor Modeling. Although using 6NF leads to an explosion of tables, modern
databases can prune the tables from select queries (using a process called 'table elimination') where they are not
required. Queries that access only a few attributes can then be faster than similar queries in databases modelled in
the third normal form.
References
[1] Date et al., 2003
[2] op. cit., chapter 9: Generalizing the relational operators
[3] op. cit., section 10.4, p. 176
[4] Zimanyi 2005
[5] Snodgrass, Richard T. TSQL2 Temporal Query Language (http://www.cs.arizona.edu/~rts/tsql2.html). Describes history, gives
references to standard and original book.
[6] See www.dbdebunk.com for a discussion on this topic (http://www.dbdebunk.com/page/page/621935.htm)
[7] See the Anchor Modeling website (http://www.anchormodeling.com) for a website that describes a data warehouse modelling method
based on the sixth normal form
Further reading
• Date, C.J. (2006). The relational database dictionary: a comprehensive glossary of relational terms and concepts,
with illustrative examples. O'Reilly Series Pocket references. O'Reilly Media, Inc. p. 90. ISBN 9780596527983.
• Date, Chris J.; Hugh Darwen, Nikos A. Lorentzos (January 2003). Temporal Data and the Relational Model: A
Detailed Investigation into the Application of Interval and Relation Theory to the Problem of Temporal Database
Management. Oxford: Elsevier LTD. ISBN 1558608559.
• Zimanyi, E. (June 2006). "Temporal Aggregates and Temporal Universal Quantification in Standard SQL"
(http://www.sigmod.org/publications/sigmod-record/0606/sigmod-record.june2006.pdf) (PDF). ACM SIGMOD
Record, volume 35, number 2, page 16. ACM.
• Date, Chris J.. "ON DK/NF NORMAL FORM" (http://www.dbdebunk.com/page/page/621935.htm).
Boyce–Codd normal form

Boyce–Codd normal form (or BCNF or 3.5NF) is a normal form used in database normalization. It is a slightly
stronger version of the third normal form (3NF). A table is in Boyce–Codd normal form if and only if for every one
of its nontrivial dependencies X → Y, X is a superkey—that is, X is either a candidate key or a superset thereof.
BCNF was developed in 1974 by Raymond F. Boyce and Edgar F. Codd to address certain types of anomaly not
dealt with by 3NF as originally defined.[1]
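The definition above can be checked mechanically. The following Python sketch is an illustration only and is not part of the original article: the helper names closure and bcnf_violations are hypothetical, and functional dependencies are assumed to be given as pairs of attribute sets. It computes the closure of each determinant and reports every nontrivial dependency whose determinant is not a superkey.

```python
from typing import FrozenSet, Iterable, List, Tuple

# A functional dependency X -> Y as (determinant X, dependent Y).
FD = Tuple[FrozenSet[str], FrozenSet[str]]


def closure(attrs: Iterable[str], fds: List[FD]) -> FrozenSet[str]:
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply X -> Y whenever X is already covered and Y adds something.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(schema: Iterable[str], fds: List[FD]) -> List[FD]:
    """List the given nontrivial FDs whose determinant is not a superkey."""
    all_attrs = frozenset(schema)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs                     # skip trivial dependencies
        and closure(lhs, fds) != all_attrs    # determinant is not a superkey
    ]


# Hypothetical schema with attributes A, B, C and dependencies AB -> C, C -> B.
example_fds: List[FD] = [
    (frozenset("AB"), frozenset("C")),
    (frozenset("C"), frozenset("B")),
]
violations = bcnf_violations("ABC", example_fds)  # reports only C -> B
```

The sketch inspects only the dependencies it is given, which is enough to decide whether the table as a whole is in BCNF. Checking a table produced by decomposition would also require the dependencies implied for that smaller table.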
Chris Date has pointed out that a definition of what we now know as BCNF appeared in a paper by Ian Heath in
1971.[2] Date writes:
"Since that definition predated Boyce and Codd's own definition by some three years, it seems to me
that BCNF ought by rights to be called Heath normal form. But it isn't."[3]
3NF tables not meeting BCNF

Only in rare cases does a 3NF table not meet the requirements of BCNF. A 3NF table which does not have multiple
overlapping candidate keys is guaranteed to be in BCNF.[4] Depending on what its functional dependencies are, a
3NF table with two or more overlapping candidate keys may or may not be in BCNF.
An example of a 3NF table that does not meet BCNF is:
Today's Court Bookings

Court  Start Time  End Time  Rate Type
1      09:30       10:30     SAVER
1      11:00       12:00     SAVER
1      14:00       15:30     STANDARD
2      10:00       11:30     PREMIUM-B
2      11:30       13:30     PREMIUM-B
2      15:00       16:30     PREMIUM-A
• Each row in the table represents a court booking at a tennis club that has one hard court (Court 1) and one grass
court (Court 2).
• A booking is defined by its Court and the period for which the Court is reserved.
• Additionally, each booking has a Rate Type associated with it. There are four distinct rate types:
  • SAVER, for Court 1 bookings made by members
  • STANDARD, for Court 1 bookings made by non-members
  • PREMIUM-A, for Court 2 bookings made by members
  • PREMIUM-B, for Court 2 bookings made by non-members
The table's candidate keys are:
• {Court, Start Time}
• {Court, End Time}
• {Rate Type, Start Time}
• {Rate Type, End Time}
Recall that 2NF prohibits partial functional dependencies of non-prime attributes on candidate keys, and that 3NF
prohibits transitive functional dependencies of non-prime attributes on candidate keys. In the Today's Court
Bookings table, there are no non-prime attributes: that is, all attributes belong to candidate keys. Therefore the table
adheres to both 2NF and 3NF.
The table does not adhere to BCNF. This is because of the dependency Rate Type → Court, in which the
determining attribute (Rate Type) is neither a candidate key nor a superset of a candidate key.
The dependency Rate Type → Court holds because each Rate Type applies to exactly one Court.
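The two facts behind the BCNF violation can be checked against the sample rows. The Python sketch below is illustrative: the column names and the helpers fd_holds and is_unique are assumptions made for this example. Note that sample data can only refute a dependency or a key; that Rate Type → Court holds in general follows from the business rules, not from these six rows.

```python
# The rows of the Today's Court Bookings table.
ROWS = [
    {"court": 1, "start": "09:30", "end": "10:30", "rate_type": "SAVER"},
    {"court": 1, "start": "11:00", "end": "12:00", "rate_type": "SAVER"},
    {"court": 1, "start": "14:00", "end": "15:30", "rate_type": "STANDARD"},
    {"court": 2, "start": "10:00", "end": "11:30", "rate_type": "PREMIUM-B"},
    {"court": 2, "start": "11:30", "end": "13:30", "rate_type": "PREMIUM-B"},
    {"court": 2, "start": "15:00", "end": "16:30", "rate_type": "PREMIUM-A"},
]


def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_unique(rows, attrs):
    """True if no two rows share the same values for attrs."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(keys) == len(set(keys))


rate_type_determines_court = fd_holds(ROWS, ["rate_type"], ["court"])
rate_type_identifies_rows = is_unique(ROWS, ["rate_type"])
court_start_identifies_rows = is_unique(ROWS, ["court", "start"])
```

Here the dependency is consistent with the data, yet Rate Type alone does not identify a row (SAVER and PREMIUM-B each appear twice), so its determinant is not a superkey.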
The design can be amended so that it meets BCNF:
Rate Types
Rate Type  Court  Member Flag
SAVER      1      Yes
STANDARD   1      No
PREMIUM-A  2      Yes
PREMIUM-B  2      No

Today's Bookings
Rate Type  Start Time  End Time
SAVER      09:30       10:30
SAVER      11:00       12:00
STANDARD   14:00       15:30
PREMIUM-B  10:00       11:30
PREMIUM-B  11:30       13:30
PREMIUM-A  15:00       16:30
The candidate keys for the Rate Types table are {Rate Type} and {Court, Member Flag}; the candidate keys for the
Today's Bookings table are {Rate Type, Start Time} and {Rate Type, End Time}. Both tables are in BCNF. Having
one Rate Type associated with two different Courts is now impossible, so the anomaly affecting the original table
has been eliminated.
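As a rough illustration of why the anomaly disappears, the two-table design can be declared with its candidate keys as constraints. The sketch below uses Python's standard sqlite3 module; the table and column names are assumptions chosen for this example, and the attempted row uses a hypothetical Court 3 so that only the {Rate Type} key is violated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
CREATE TABLE rate_types (
    rate_type   TEXT PRIMARY KEY,             -- candidate key {Rate Type}
    court       INTEGER NOT NULL,
    member_flag TEXT NOT NULL,
    UNIQUE (court, member_flag)               -- candidate key {Court, Member Flag}
);
CREATE TABLE todays_bookings (
    rate_type  TEXT NOT NULL REFERENCES rate_types (rate_type),
    start_time TEXT NOT NULL,
    end_time   TEXT NOT NULL,
    PRIMARY KEY (rate_type, start_time),      -- candidate key {Rate Type, Start Time}
    UNIQUE (rate_type, end_time)              -- candidate key {Rate Type, End Time}
);
""")
conn.executemany(
    "INSERT INTO rate_types VALUES (?, ?, ?)",
    [("SAVER", 1, "Yes"), ("STANDARD", 1, "No"),
     ("PREMIUM-A", 2, "Yes"), ("PREMIUM-B", 2, "No")],
)


def second_court_for_saver_rejected(connection: sqlite3.Connection) -> bool:
    """Try to associate SAVER with another court; True if the database refuses."""
    try:
        connection.execute("INSERT INTO rate_types VALUES ('SAVER', 3, 'Yes')")
    except sqlite3.IntegrityError:
        return True
    return False


rejected = second_court_for_saver_rejected(conn)
```

Because each Rate Type is stored once, a second Court for the same Rate Type would need a second row with the same key, which the primary key forbids.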
Achievability of BCNF
In some cases, a non-BCNF table cannot be decomposed into tables that satisfy BCNF and preserve the
dependencies that held in the original table. Beeri and Bernstein showed in 1979 that, for example, a set of functional
dependencies {AB → C, C → B} cannot be represented by a BCNF schema.[5] Thus, unlike the first three normal
forms, BCNF is not always achievable.
Consider the following non-BCNF table whose functional dependencies follow the {AB → C, C → B} pattern:
Nearest Shops
Person    Shop Type    Nearest Shop
Davidson  Optician     Eagle Eye
Davidson  Hairdresser  Snippets
Wright    Bookshop     Merlin Books
Fuller    Bakery       Doughy's
Fuller    Hairdresser  Sweeney Todd's
Fuller    Optician     Eagle Eye
For each Person / Shop Type combination, the table tells us which shop of this type is geographically nearest to the
person's home. We assume for simplicity that a single shop cannot be of more than one type.
The candidate keys of the table are:
• {Person, Shop Type}
• {Person, Nearest Shop}
Because all three attributes are prime attributes (i.e. belong to candidate keys), the table is in 3NF. The table is not in
BCNF, however, as the Shop Type attribute is functionally dependent on a non-superkey: Nearest Shop.
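The candidate keys listed above can be derived by brute force from the two dependencies {Person, Shop Type} → {Nearest Shop} and {Nearest Shop} → {Shop Type}. The following Python sketch is illustrative only; the helper names closure and candidate_keys are hypothetical, and exhaustive search is practical only for small schemas such as this one.

```python
from itertools import combinations

ATTRS = ("Person", "Shop Type", "Nearest Shop")
FDS = [
    ({"Person", "Shop Type"}, {"Nearest Shop"}),
    ({"Nearest Shop"}, {"Shop Type"}),
]


def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Return the minimal attribute sets whose closure is the whole schema."""
    keys = []
    # Visit subsets from smallest to largest so minimal keys are found first.
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            combo_set = set(combo)
            if any(key <= combo_set for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo_set, fds) == set(attrs):
                keys.append(combo_set)
    return keys


keys = candidate_keys(ATTRS, FDS)
# Nearest Shop determines Shop Type but not Person, so it is not a superkey.
nearest_shop_is_superkey = closure({"Nearest Shop"}, FDS) == set(ATTRS)
```

The search yields {Person, Shop Type} and {Person, Nearest Shop}, and confirms that Nearest Shop on its own is not a superkey, which is the source of the BCNF violation.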
The violation of BCNF means that the table is subject to anomalies. For example, Eagle Eye might have its Shop
Type changed to "Optometrist" on its "Fuller" record while retaining the Shop Type "Optician" on its "Davidson"
record. This would imply contradictory answers to the question: "What is Eagle Eye's Shop Type?" Holding each
shop's Shop Type only once would seem preferable, as doing so would prevent such anomalies from occurring:
Shop Near Person

Person    Shop
Davidson  Eagle Eye
Davidson  Snippets
Wright    Merlin Books
Fuller    Doughy's
Fuller    Sweeney Todd's
Fuller    Eagle Eye

Shop
Shop            Shop Type
Eagle Eye       Optician
Snippets        Hairdresser
Merlin Books    Bookshop
Doughy's        Bakery
Sweeney Todd's  Hairdresser
In this revised design, the "Shop Near Person" table has a candidate key of {Person, Shop}, and the "Shop" table has
a candidate key of {Shop}. Unfortunately, although this design adheres to BCNF, it is unacceptable on different
grounds: it allows us to record multiple shops of the same type against the same person. In other words, its candidate
keys do not guarantee that the functional dependency {Person, Shop Type} → {Shop} will be respected.
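The lost dependency can be demonstrated with a small experiment. The sketch below, using Python's standard sqlite3 module with assumed table and column names, declares the BCNF design with its candidate keys and then records two hairdressers against Davidson. The extra Davidson row is invented purely to show the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE shop (
    shop      TEXT PRIMARY KEY,               -- candidate key {Shop}
    shop_type TEXT NOT NULL
);
CREATE TABLE shop_near_person (
    person TEXT NOT NULL,
    shop   TEXT NOT NULL REFERENCES shop (shop),
    PRIMARY KEY (person, shop)                -- candidate key {Person, Shop}
);
""")
conn.executemany(
    "INSERT INTO shop VALUES (?, ?)",
    [("Eagle Eye", "Optician"), ("Snippets", "Hairdresser"),
     ("Merlin Books", "Bookshop"), ("Doughy's", "Bakery"),
     ("Sweeney Todd's", "Hairdresser")],
)
conn.executemany(
    "INSERT INTO shop_near_person VALUES (?, ?)",
    [("Davidson", "Eagle Eye"),
     ("Davidson", "Snippets"),
     ("Davidson", "Sweeney Todd's")],  # invented row: a second hairdresser
)

# Find every person who has more than one "nearest" shop of the same type.
duplicates = conn.execute("""
    SELECT n.person, s.shop_type, COUNT(*)
    FROM shop_near_person AS n
    JOIN shop AS s ON s.shop = n.shop
    GROUP BY n.person, s.shop_type
    HAVING COUNT(*) > 1
""").fetchall()
```

Every insert succeeds, and the query then finds Davidson with two hairdressers: no key in either table expresses {Person, Shop Type} → {Shop}, so the database cannot enforce it.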
A design that eliminates all of these anomalies (but does not conform to BCNF) is possible.[6] This design consists of
the original "Nearest Shops" table supplemented by the "Shop" table described above.
Nearest Shops
Person    Shop Type    Nearest Shop
Davidson  Optician     Eagle Eye
Davidson  Hairdresser  Snippets
Wright    Bookshop     Merlin Books
Fuller    Bakery       Doughy's
Fuller    Hairdresser  Sweeney Todd's
Fuller    Optician     Eagle Eye

Shop
Shop            Shop Type
Eagle Eye       Optician
Snippets        Hairdresser
Merlin Books    Bookshop
Doughy's        Bakery
Sweeney Todd's  Hairdresser
If a referential integrity constraint is defined to the effect that {Shop Type, Nearest Shop} from the first table must
refer to a {Shop Type, Shop} from the second table, then the data anomalies described previously are prevented.
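One way to express such a constraint is a composite foreign key. The sketch below uses Python's standard sqlite3 module; the table and column names, the rejected helper, and the extra UNIQUE constraint on the "Shop" table (needed in SQLite so that the pair of columns can be the target of a foreign key) are assumptions made for this illustration rather than details given by the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
CREATE TABLE shop (
    shop      TEXT PRIMARY KEY,
    shop_type TEXT NOT NULL,
    UNIQUE (shop_type, shop)                  -- target of the composite foreign key
);
CREATE TABLE nearest_shops (
    person       TEXT NOT NULL,
    shop_type    TEXT NOT NULL,
    nearest_shop TEXT NOT NULL,
    PRIMARY KEY (person, shop_type),          -- one nearest shop per person and type
    FOREIGN KEY (shop_type, nearest_shop) REFERENCES shop (shop_type, shop)
);
""")
conn.executemany(
    "INSERT INTO shop VALUES (?, ?)",
    [("Eagle Eye", "Optician"), ("Snippets", "Hairdresser"),
     ("Sweeney Todd's", "Hairdresser")],
)
conn.executemany(
    "INSERT INTO nearest_shops VALUES (?, ?, ?)",
    [("Davidson", "Optician", "Eagle Eye"),
     ("Davidson", "Hairdresser", "Snippets")],
)


def rejected(row) -> bool:
    """Try to insert a nearest_shops row; True if a constraint refuses it."""
    try:
        conn.execute("INSERT INTO nearest_shops VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        return True
    return False


# Contradictory Shop Type for Eagle Eye: refused by the foreign key.
wrong_type_rejected = rejected(("Fuller", "Optometrist", "Eagle Eye"))
# Second hairdresser for Davidson: refused by the primary key.
second_hairdresser_rejected = rejected(("Davidson", "Hairdresser", "Sweeney Todd's"))
# A consistent row is still accepted.
valid_row_rejected = rejected(("Fuller", "Optician", "Eagle Eye"))
```

The foreign key prevents a shop from being recorded with a Shop Type other than the one held in the "Shop" table, while the key {Person, Shop Type} prevents two shops of the same type being recorded for one person.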
References
[1] Codd, E. F. "Recent Investigations into Relational Data Base Systems." IBM Research Report RJ1385 (April 23rd, 1974). Republished in
Proc. 1974 Congress (Stockholm, Sweden, 1974). New York, N.Y.: North-Holland (1974).
[2] Heath, I. "Unacceptable File Operations in a Relational Database." Proc. 1971 ACM SIGFIDET Workshop on Data Description, Access, and
Control, San Diego, Calif. (November 11th–12th, 1971).
[3] Date, C.J. Database in Depth: Relational Theory for Practitioners. O'Reilly (2005), p. 142.
[4] Vincent, M.W. and B. Srinivasan. "A Note on Relation Schemes Which Are in 3NF But Not in BCNF." Information Processing Letters
48(6), 1993, pp. 281–83.
[5] Beeri, Catriel and Bernstein, Philip A. "Computational problems related to the design of normal form relational schemas." ACM Transactions
on Database Systems 4(1), March 1979, p. 50.
September 1982, pp. 493.
Bibliography
• Date, C. J. (1999). An Introduction to Database Systems (8th ed.). Addison-Wesley Longman. ISBN
0-321-19784-4.
External links
• Rules Of Data Normalization (http://web.archive.org/web/20080805014412/http://www.datamodel.org/
NormalizationRules.html)
• Advanced Normalization (http://web.archive.org/web/20080423014733/http://www.utexas.edu/its/
archive/windows/database/datamodeling/rm/rm8.html) by ITS, University of Texas.
Article Sources and Contributors

Database normalization Source: http://en.wikipedia.org/w/index.php?oldid=418888925 Contributors: 1exec1, ARPIT SRIVASTAV, Ahoerstemeier, Akamad, Akhristov, Alai, Alasdair, Alest,
Alpha 4615, Amr40, AndrewWTaylor, Antonielly, Anwar saadat, Apapadop, Arakunem, Archer3, Arcturus, Ascend, AstroWiki, AubreyEllenShomo, Autocracy, AutumnSnow, Azhar600-1,
BMF81, Babbling.Brook, Bernard François, Bewildebeast, Billben74, Billpennock, BillyPreset, Black Eagle, Blade44, Blakewest, Blanchardb, Bloodshedder, Blowdart, BlueNovember,
BlueWanderer, Bongwarrior, Boson, Bovineone, BradBeattie, Brick Thrower, BrokenSegue, Bruceshining, Bschmidt, Bugsbunny1611, BuzCo, CLW, Can't sleep, clown will eat me, Chairboy,
Chrislk02, Citral, CodeNaked, Conversion script, Creature, Crenner, DARTH SIDIOUS 2, Damian Yerrick, DanMS, Dancraggs, Danlev, Datasmid, David Colbourn, DavidConrad,
DavidHOzAu, Davidhorman, Decrease789, Demosta, DerHexer, Dfass, Dflock, Discospinster, Doc vogt, DocRuby, Docu, Doud101, Dreftymac, Drowne, Dthomsen8, Duke Ganote, Edward Z.
Yang, Eghanvat, Elcool83, Elwikipedista, EmmetCaulfield, Emperorbma, Encognito, Enric Naval, Epepke, Eric Burnett, Escape Orbit, Ethan, Evilyuffie, Falcon8765, Farquaadhnchmn,
Fathergod, FauxFaux, Fieldday-sunday, Fireman biff, Flewellyn, Fmjohnson, Fraggle81, Fred Bradstadt, Furrykef, Gadfium, GateKeeper, Gimboid13, Gk5885, Gogo Dodo, Gottabekd, Gregbard,
GregorB, Groganus, Gustavb, Guybrush, Gwr2004, Hadal, Hanifbbz, Hapsiainen, HbJ, Hbf, Heracles31, HiDrNick, History2007, Hoo man, Hu12, Hydrogen Iodide, Ianblanes, IceUnshattered,
Imre Fabian, Inquam, Intgr, JamesBWatson, Jamesday, Jamesjusty, Jan Hidders, Japo, Javert16, Jdlambert, Jgro, Jjjjjjjjjj, Jklin, Joness59, Joseph Dwayne, Jpatokal, Jpo, Justin W Smith,
KAtremer, KathrynLybarger, Keane2007, Keegan, KevinOwen, KeyStroke, Keyvez, Kgwikipedian, Kingpin13, Klausness, L Kensington, L'Aquatique, LOL, Larsinio, Lawrence Cohen,
Leandrod, Lee J Haywood, Legless the oaf, Leleutd, Leotohill, Lerdthenerd, Les boys, Lethe, Libcub, Lifeweaver, Linhvn88, LittleOldMe, Longhair, Lssilva, Lujianxiong, Lulu of the
Lotus-Eaters, Lumingz, Luna Santin, M4gnum0n, MER-C, Magantygk, Manavkataria, Mark Renier, Marknew, MarownIOM, MartinHarper, Masterstupid, Materialscientist, Matmota, Matthew
1130, Mckaysalisbury, Metaeducation, Michael Hardy, Michalis Famelis, Microtony, Mike Rosoft, Mikeblas, Mikeo, Mindmatrix, Miss Madeline, Mjhorrell, Mo0, Modeha, Mooredc, Mpd, Mr
Stephen, MrDarcy, Nabav, NawlinWiki, Nick1nildram, NickCT, NoahWolfe, Noisy, Nsaa, NubKnacker, Obradovic Goran, Ocrow, OliverMay, Opes, Oxymoron83, Pagh, Peachey88, Pearle,
Perfectblue97, Pete142, Pharaoh of the Wizards, Phil Boswell, Pie Man 360, Plastic rat, Polluxian, ProveIt, Purplepiano, Quarl, RB972, RadioFan, Railgun, Rdsmith4, Rdummarf, Reedy,
Regancy42, Reinyday, Remy B, Reofi, RichF, Rjwilmsi, Robert McClenon, Robomaeyhem, Rockcool19, Rodasmith, Romke, Ronfagin, Rp, Rumplefish, Ruud Koot, Ryulong, Sam Hocevar,
ScottJ, Scwlong, Seaphoto, Sfnhltb, Shadowjams, Shawn wiki, Shreyasjoshis, Shyamal, Silpi, Simeon, Simetrical, Sixpence, Skritek, Smjg, Smurfix, Snezzy, Snigbrook, Sonett72, Soulpatch,
Soumyasch, Spacesoon, Sstrader, Stannered, Starwiz, Stephen e nelson, SteveHL, Stifle, Stolkin, Sue Rangell, Superjaws, Sydneyw, Sylvain Mielot, Szathmar, Taw, Tbhotch, Tcamacho,
Tedickey, Teknic, Tgantos, The Thing That Should Not Be, The undertow, The1physicist, Tide rolls, Titofhr, Tobias Bergemann, Toddst1, Tom Lougheed, Tommy2010, Toxicwaste288, Traxs7,
Troels Arvin, Turnstep, Twinney12, Tyc20, Unforgettableid, Upholder, Utcursch, Vald, Valdor65, VinceBowdren, Vladsinger, Vodak, Voidxor, Waggers, Wakimakirolls, Wexcan,
WikipedianYknOK, Wildheat, Wilfordbrimley, Wilsondavidc, Winterst, Wjhonson, Woohookitty, WookieInHeat, Xiong Chiamiov, Xiroth, Yong-Yeol Ahn, Zedla, Zeyn1, Zhenqinli, Zzuuzz,
1147 anonymous edits
First normal form Source: http://en.wikipedia.org/w/index.php?oldid=417434759 Contributors: Aeonx, Alansohn, Alxndr, Ambuj.Saxena, Bernard Ladenthin, BillyPreset, Boson, Brianga,
Brick Thrower, Burner0718, Closedmouth, Davidhorman, Dfass, Dreftymac, Eallik, Ebraminio, Eibcga, General Wesc, GermanX, GregorB, Gwernol, Hamidrizeh, Heathcliff, Isnow, Jacobolus,
Jason Quinn, Jgzheng, Kwetal, LarRan, Lordmwesh, M.r santosh kumar., Mfpinhal, Montchav, Mystagogue, Nabav, NawlinWiki, ReformatMe, Vegpuff, VictorAnyakin, VinceBowdren, 파핀,
139 anonymous edits
Second normal form Source: http://en.wikipedia.org/w/index.php?oldid=417759225 Contributors: Ak786, Apugazh, Benjamin.Cramphorn, Bernard Ladenthin, Boson, Btilm, Carlhoerberg,
Chrislk02, DARTH SIDIOUS 2, DVdm, ESkog, Ebraminio, GermanX, GregorB, Haffasoul, Ijliao, Jason Quinn, Javert16, JianzhouZhou, JimpsEd, Mordashov, Nabav, Sanchitideas,
Shreyasjoshis, SqlPac, Uncle Dick, VinceBowdren, 파핀, 59 anonymous edits
Third normal form Source: http://en.wikipedia.org/w/index.php?oldid=414029712 Contributors: Alvin-cs, Amalthea, Anabus, Arcturus, Azrich, Bernard Ladenthin, Blahma, Boson, Bxn1358,
CapitalR, Centrx, Codeculturist, DVdm, Dorfl, Ebraminio, Edward Z. Yang, Furrykef, GermanX, Gingerjoos, Ijliao, Jason Quinn, Jcsalterego, Jitse Niesen, Joseph Dwayne, Jswhitten,
Kitkatbeard, Leasabp, MsHyde, Nabav, Natural Cut, Ollie, Pinethicket, Semaphorite, Shreyasjoshis, Sleske, Someusername222, THEN WHO WAS PHONE?, Thingg, Toyota prius 2, Unara,
Vegpuff, VinceBowdren, Vlad2000Plus, Wikimiro, Willking1979, Wyadbb, 80 anonymous edits
Fourth normal form Source: http://en.wikipedia.org/w/index.php?oldid=412588144 Contributors: Akerans, Bernard Ladenthin, Britannica, Ebraminio, Fetchcomms, Geeoharee, GermanX,
Jason Quinn, Jmabel, Nabav, Patrick, Savh, Selfworm, VinceBowdren, Vjosullivan, WikHead, Winterst, 25 anonymous edits
Fifth normal form Source: http://en.wikipedia.org/w/index.php?oldid=414829532 Contributors: Andy M. Wang, Bernard Ladenthin, Brick Thrower, Cool Blue, Dugo, Ebraminio, FineganCJ,
GermanX, Jason Quinn, Libcub, MarcosWozniak, Nabav, Quarl, RonaldKunenborg, Siryendor, SqlPac, Stamfest, Systemparadox, VinceBowdren, 31 anonymous edits
Sixth normal form Source: http://en.wikipedia.org/w/index.php?oldid=414422499 Contributors: Boson, DePiep, Emurphy42, Esran, Favonian, GregorB, Jason Quinn, Nabav, Quarl,
Roenbaeck, RonaldKunenborg, Rp, 9 anonymous edits
Boyce–Codd normal form Source: http://en.wikipedia.org/w/index.php?oldid=411396753 Contributors: Anugrah atreya, Bernard Ladenthin, Briangregory2000, Chitransh saxena,
CiudadanoGlobal, Ebraminio, Eggman64, Fctseng, Fieldday-sunday, JForget, Jgzheng, JimpsEd, Leflyman, Mikeblas, Nabav, Nay Min Thu, NeerajKawathekar, Niddriesteve, Njsg, Obradovic
Goran, Oxymoron83, Quarl, Raztus, Simetrical, Smurfix, Solomon423, SqlPac, Su30, Torzsmokus, Uzume, VinceBowdren, Yachtsman1, ZenSaohu, 62 anonymous edits
Image Sources, Licenses and Contributors

File:Update anomaly.png Source: http://en.wikipedia.org/w/index.php?title=File:Update_anomaly.png License: Public Domain Contributors: Original uploader was Nabav at en.wikipedia
File:Insertion anomaly.svg Source: http://en.wikipedia.org/w/index.php?title=File:Insertion_anomaly.svg License: unknown Contributors: User:Stannered
File:Deletion anomaly.svg Source: http://en.wikipedia.org/w/index.php?title=File:Deletion_anomaly.svg License: unknown Contributors: User:Stannered
License
Creative Commons Attribution-Share Alike 3.0 Unported
http://creativecommons.org/licenses/by-sa/3.0/
