Normalized vs. Denormalized: Normalization

Normalized vs.
Denormalized
Normalization:

Normalization is the process of efficiently organizing data in a database. There are
two goals of the normalization process: eliminating redundant data (for example,
storing the same data in more than one table) and ensuring data dependencies
make sense (only storing related data in a table). Both of these are worthy goals as
they reduce the amount of space a database consumes and ensure that data is
logically stored.

Denormalization:

Denormalization is the process of attempting to optimise the read performance of a
database by adding redundant data or by grouping data. In some cases,
denormalisation helps cover up the inefficiencies inherent in relational database
software. A relational normalised database imposes a heavy access load over
physical storage of data even if it is well tuned for high performance.
Normalization: The first three forms

First Normal Form:
No repeating groups. As an example, it might be tempting to make an invoice table
with columns for the first, second, and third line item (see above). This violates the
first normal form, and would result in large rows, wasted space (where an invoice
had less than the maximum number of line items), and *horrible* SQL statements
with a separate join for each repetition of the column. First form normalization
requires you make a separate line item table, with it's own key (in this case the
combination of invoice number and line number) (See below).

Second Normal Form:
Each column must depend on the *entire* primary key. As an example, the customer information
could be put in the line item table (see above). The trouble with that is that the customer goes with the
invoice, not with each line on the invoice. Putting customer information in the line item table will cause
redundant data, with it's inherant overhead and difficult modifications. Second form normalization
requires you place the customer information in the invoice table (see below).

Third Normal Form:
Each column must depend on *directly* on the primary key. As an example, the customer address
could go in the invoice table (see above), but this would cause data redundancy if several invoices
were for the same customer. It would also cause an update nightmare when the customer changes
his address, and would require extensive programming to insert the address every time an existing
customer gets a new invoice. Third form normalization requires the customer address go in a
separate customer table with its own key (customer), with only the customer identifier in the invoice
table (see below).

What is the difference between Normalized & denormalized, and when do you
use which?
When most of us design our databases we tend to think of them as related (and un-
related) tables that contain data. It is common to normalize our tables in order to
create a cleaner, more manageable, and intuative Database design. Normalized
means separate tables via foreign key relationship, there are however things that we
should consider before we normalize our system:
Historical Data:
One of the most important is historical data. When we normalize our system we open
the door for historically inaccuracy. What's that mean? It means that we can change
the data in the related table, causing us to lose the accuracy of the data thats related
to it.

What do you mean "lose the accuracy"? Lets look at an example: We have a
Member table which holds a foreign key referrence to the States table. If we change
the data in States table, which is referrenced by the Member table, we have no way
of knowing what that the Member record originally contained (like change Republic
of Congo to Republic of China).

So why should we care, isn't this the whole reason why we normalize our database?
Yes and No. If we know that we will never have the need to keep an audit trail or
report on the data then surely normalizing it is perfectly suitable. However we all
know that software requirment change, and there may come a time when the
Marketing Department wants to know the history of how its members data has
changed over time, normalization would destroy this historical record (unless we
some how dig it out of a backup, which would be a major pain in the rear),
denormalized does not.
Reporting:
If you think that you will ever have to do any kind of Data Mining or Reporting on
your data then avoiding over-normalization is important. Reporting on, or performing
calculations on, large amounts of data requires that you denormalize your database.
Anticipating this during the design phases can save you a lot of time and grief later
on.

Normalized vs. Denormalized: Normalization

Enviado por

Dados do documento

Título original

Direitos autorais

Formatos disponíveis

Compartilhar este documento

Compartilhar ou incorporar documento

Opções de compartilhamento

Você considera este documento útil?

Este conteúdo é inapropriado?

Direitos autorais:

Formatos disponíveis

Normalized vs. Denormalized: Normalization

Enviado por

Direitos autorais:

Formatos disponíveis

Normalized vs.

Você também pode gostar