
22/07/2010

Database and Information Retrieval


ICT118 Lecture 2: Relational Theory, Entities and Normalisation

22 July, 2010

ICT118 Database and Information Management, Sem 2, 2010. Includes some Cengage material


Database Design
Storage of business information in a database must be planned
Current database design is based on Codd's relational theory
Business need -> entity identification -> ERD modelling -> normalisation -> db design


Entities
Identifiable things of interest to the business
Entities do something or record something
Entities interact with other entities in the business
  interaction implies processing
Entities have attributes


Entity-Relationship Model
Used to depict the relationship that exists among entities
The following relationship types can be included in an E-R model:
  One-to-one
  One-to-many
  Many-to-many

E-R Model Notation Examples


[Figure: E-R notation examples]
One to one: Subject, Examination
One to many: Student, Enrolment
Many to many: Student, Subject

Notation conventions
Cardinality is the term which indicates the number of required instances at each end of a relationship
[Notation symbols: One instance, Many instances]


Notation conventions
Optionality indicates whether an instance of an entity must exist in the relationship or is optional
Numeric ranges can be specified, e.g. 0..*
[Notation symbols: Optional, Mandatory]

One-to-One Relationship
A record in one entity is related to only one record in the other entity
Example: Each aircraft passenger is allocated one seat, and each seat is allocated to just one passenger

[Diagram: PASSENGER to SEAT, one-to-one]


One-to-Many Relationship
Each record in one entity can be related to one or more records in the other entity
Example: A class has only one lecturer, but each lecturer can teach many classes

[Diagram: LECTURER to CLASS, one-to-many, labelled 1..4]
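
A one-to-many relationship is commonly stored by placing the key of the "one" side in the table for the "many" side. The following is a minimal sketch using Python's built-in sqlite3 module. The column names (lecturer_id, class_code) and the sample rows are illustrative assumptions, not taken from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE lecturer (
        lecturer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    -- Each class row points at exactly one lecturer (the "one" side).
    CREATE TABLE class (
        class_code  TEXT PRIMARY KEY,
        lecturer_id INTEGER NOT NULL REFERENCES lecturer(lecturer_id)
    );
""")

conn.execute("INSERT INTO lecturer VALUES (1, 'Lecturer A')")
conn.executemany(
    "INSERT INTO class VALUES (?, ?)",
    [("ICT118-T1", 1), ("ICT118-T2", 1)],  # one lecturer, many classes
)

classes_per_lecturer = conn.execute(
    "SELECT lecturer_id, COUNT(*) FROM class GROUP BY lecturer_id"
).fetchall()


def add_class_for_unknown_lecturer():
    """Try to add a class whose lecturer does not exist; report whether it worked."""
    try:
        conn.execute("INSERT INTO class VALUES ('ICT118-T3', 99)")
    except sqlite3.IntegrityError:
        return False
    return True
```

Because lecturer_id is a single NOT NULL column in class, a class cannot have two lecturers or none, while nothing limits how many class rows share one lecturer. An upper bound such as 1..4 would need an extra rule that this sketch does not attempt.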


Many-to-Many Relationship
Each record in an entity can be related to multiple occurrences in the other entity
Example: A student can take many classes, and each class is composed of many students

[Diagram: STUDENT to CLASS, many-to-many, labelled 1..5 and 10..25]
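
A many-to-many relationship cannot be held by a single foreign key column on either side. A common approach is a third table holding one row per (student, class) pair. The sketch below uses Python's sqlite3 module; the table name class_enrolment and the sample identifiers are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_num TEXT PRIMARY KEY);
    CREATE TABLE class   (class_code  TEXT PRIMARY KEY);
    -- Linking table: one row per student/class pair, keyed by both columns.
    CREATE TABLE class_enrolment (
        student_num TEXT NOT NULL REFERENCES student(student_num),
        class_code  TEXT NOT NULL REFERENCES class(class_code),
        PRIMARY KEY (student_num, class_code)
    );
""")
conn.executemany("INSERT INTO student VALUES (?)", [("S1",), ("S2",)])
conn.executemany("INSERT INTO class VALUES (?)", [("C1",), ("C2",)])
conn.executemany(
    "INSERT INTO class_enrolment VALUES (?, ?)",
    [("S1", "C1"), ("S1", "C2"), ("S2", "C1")],
)

# A student can take many classes ...
classes_of_s1 = [row[0] for row in conn.execute(
    "SELECT class_code FROM class_enrolment "
    "WHERE student_num = 'S1' ORDER BY class_code")]

# ... and a class is composed of many students.
students_in_c1 = [row[0] for row in conn.execute(
    "SELECT student_num FROM class_enrolment "
    "WHERE class_code = 'C1' ORDER BY student_num")]
```

The linking table turns one many-to-many relationship into two one-to-many relationships, which is the shape that relational tables can store directly.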


Entity-Relationship Diagram (ERD)

A diagrammatic representation of the relationships between all entities of interest in a business situation
Forms the basis for building a database after the process of normalisation


JustLee Example E-R Model

Figure 1-4: An E-R model for JustLee Books


Attributes
An individual field that describes one aspect of the entity
  A name          Size12 100mm bolt
  An identifier   0844255
  A date          04-MAY-2010
  A measurement   328
  A reference     0844320

One or more attributes will uniquely identify each instance of the entity (the key)

Attributes
An entity will be composed of (described by) a collection of attributes which taken together represent that entity
  Name, Description, Size, Classification, Quantity

The values of each attribute distinguish one instance of an entity from another

Attribute sharing
One particular attribute may be associated with several entities
Product identifier in:
  PRODUCT entity
  ORDER entity
  DELIVERY entity


Functional Dependency
The concept that one thing depends on the value of another (or that one thing is determined by the value of another)
e.g. I need milk and it costs $1.50 per litre. How much do I pay for milk?
  2 litres = $3.00, 5 litres = $7.50
The amount I pay is determined by (dependent on) the number of litres
Quantity is the determinant of amount paid
Amount paid is functionally dependent on quantity
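
The milk example can be written as a function in the programming sense: the same quantity always yields the same amount. This is only a small sketch; the names PRICE_PER_LITRE and amount_paid are made up for illustration.

```python
PRICE_PER_LITRE = 1.50  # from the example: milk costs $1.50 per litre


def amount_paid(litres):
    """Amount paid is determined solely by the quantity bought."""
    return PRICE_PER_LITRE * litres


two_litres = amount_paid(2)
five_litres = amount_paid(5)
```

Knowing the quantity is enough to know the amount, which is what "quantity is the determinant of amount paid" means.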

Determinants and Relations


Relational theory requires one determinant only for a relation (table)
A determinant is called a key, and there may be several keys in the entities that we find
A key or determinant is said to identify a relation
The normalisation process modifies our entities into relations having a single determinant
One un-normalised entity will probably produce several normalised relations

Functional dependency and keys


An attribute is functionally dependent on a key when it can have only one value for each value of that key attribute
Example from the textbook: Publication date can have only one value for each value of the key attribute ISBN
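
One way to test a suspected functional dependency against sample data is to check that no key value is ever paired with two different dependent values. The sketch below does that for ISBN and publication date. The function name and the sample rows (including the ISBN values and dates) are invented for illustration.

```python
def functionally_determines(rows, determinant, dependent):
    """Return True if every determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key = row[determinant]
        value = row[dependent]
        # setdefault stores the first value seen for this key and returns it;
        # any later, different value breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows, not taken from the textbook.
books = [
    {"isbn": "0000000001", "pub_date": "2001-01-01", "copy": 1},
    {"isbn": "0000000001", "pub_date": "2001-01-01", "copy": 2},
    {"isbn": "0000000002", "pub_date": "2003-06-15", "copy": 1},
]

isbn_determines_date = functionally_determines(books, "isbn", "pub_date")
date_determines_copy = functionally_determines(books, "pub_date", "copy")
```

A check like this can only disprove a dependency. Passing on a sample does not prove that the dependency holds for the business in general; that still has to come from understanding the data.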


Notation conventions
Entities are recorded as sequential lists of attributes within brackets
The primary identifying attribute(s) (the identifier, key, or determinant) is underlined
Product (Product-code, Product-descr, Supplier-identifier, Quantity-on-hand, UOM, Reorder-qty, Cost, Retail, Product-family) [key: Product-code]


Notation conventions
Some groups of attributes occur several times in any one instance of an entity
  Multiple subject results for one student
Student (Student-num, Course-code, Course-type, Enrol-date, Finish-date, Graduate-date, (Subject-code, subject-name, semester, year, score, grade), term-address, home-address, Enrolment-type) [key: Student-num]

Follow this format convention for defining entities



Storing Entities in a database


Entities are stored in tables that conform to Codd's relational theory
A table (or relation) is a collection of rows of attributes that must have one value only in each attribute
Every row must have the same structure (sequence of attributes)
Therefore a table resembles a regular matrix
Normalisation produces correctly structured tables
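
The "regular matrix" rule can be sketched as a small check: every row has the same sequence of attributes, and every attribute holds a single value. In the sketch below, "single value" is approximated as "not a list, tuple, set or dict", which is an assumption made for illustration and not a formal definition of atomicity. All names are hypothetical.

```python
def is_regular_table(rows):
    """True if all rows share one attribute sequence and hold one value per attribute."""
    if not rows:
        return True
    columns = list(rows[0].keys())
    for row in rows:
        # Same structure: same attributes in the same order.
        if list(row.keys()) != columns:
            return False
        # One value only in each attribute: no nested collections.
        if any(isinstance(value, (list, tuple, set, dict)) for value in row.values()):
            return False
    return True


regular = [
    {"student_num": "S1", "course_code": "C10"},
    {"student_num": "S2", "course_code": "C20"},
]
ragged = [
    {"student_num": "S1", "course_code": "C10"},
    {"student_num": "S2"},
]
nested = [
    {"student_num": "S1", "subjects": ["ICT118", "ICT999"]},
]
```

Here regular passes, while ragged fails on structure and nested fails because one attribute holds a repeating group.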

Entity Normalisation
Creates entities where all attributes are functionally dependent on the key attribute(s)
Eliminates the anomalies associated with un-normalised entities
Minimises the number of entities in which a given attribute must be stored
Normalisation is a multi-step process that you perform on every entity
Six normal forms (NF) exist. We focus on 4 only:
  First NF, Second NF, Third NF, Boyce-Codd NF

Definitions [1]
1NF: if, and only if, all underlying attribute values are atomic (i.e. cannot be meaningfully decomposed)
2NF: if, and only if, it is in 1NF and every non-key attribute is fully dependent on the primary key
3NF: if, and only if, it is in 2NF and every non-key attribute is non-transitively dependent on the primary key
[1] C.J. Date, An Introduction to Database Systems, 5th edn, Vol. 1. Reading, MA: Addison-Wesley, 1990, pp. 532-557.

Definitions
Boyce-Codd Normal Form (BCNF) is a stricter variant of 3NF that matters in special cases:
  1. when there are multiple candidate keys
  2. those candidate keys are composite keys
  3. those candidate keys overlap
These cases are less common and we will restrict our interest to the first three normal forms

Normalisation Process
Identify the attributes for each entity. These are the un-normalised entities (UNF)
Identify keys (uniquely identifying attributes)
Identify then resolve non-atomic attributes (repeating groups) (1NF)
Identify then resolve partial dependencies (2NF)
Identify then resolve transitive dependencies (3NF)

Normalisation Process
Select an entity
Identify its attributes (ignore calculated fields)
Do any of these recur as a repeating group?
For any single instance of the entity are there repetitions of groups of attributes?
Student:
  Identifier   Does it repeat? no
  Names        Does it repeat? no
  Address      Does it repeat? no
  Subjects     Does it repeat? yes

Normalisation process
Document using the conventional format
Primary keys are identified (underlined)
Repeating groups are removed into a new entity, along with the primary key

Student (Student-num, Course-code, Course-type, Enrol-date, Finish-date, Graduate-date, (Subject-code, subject-name, semester, year, score, grade), term-address, home-address, Enrolment-type) [key: Student-num]
becomes:

Normalisation Process
Student (Student-num, Course-code, Course-type, Enrol-date, Finish-date, Graduate-date, Term-address, Home-address, Enrolment-type) [key: Student-num]
plus
Enrolment (Student-num, Subject-code, subject-name, semester, year, score, grade) [key: Student-num, Subject-code]
Both are in First Normal Form (1NF)
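
The move to 1NF can be sketched in Python by treating the un-normalised student as a record with a nested list of subjects and splitting it in two. The function name, the dictionary keys and the sample values (such as the second subject code) are assumptions made for illustration, and only a few of the Student attributes are shown.

```python
def split_repeating_group(student):
    """Move the repeating subject group into Enrolment rows keyed by student_num."""
    # Student keeps every attribute except the repeating group.
    student_row = {k: v for k, v in student.items() if k != "subjects"}
    # Each subject becomes its own row, with the original key copied in.
    enrolment_rows = [
        {"student_num": student["student_num"], **subject}
        for subject in student["subjects"]
    ]
    return student_row, enrolment_rows


unf_student = {
    "student_num": "S1",
    "course_code": "C10",
    "course_type": "Bachelor",
    "subjects": [
        {"subject_code": "ICT118",
         "subject_name": "Database and Information Management",
         "semester": 2, "year": 2010, "score": 72, "grade": "C"},
        {"subject_code": "ICT999",  # hypothetical second subject
         "subject_name": "Hypothetical Subject",
         "semester": 1, "year": 2010, "score": 85, "grade": "HD"},
    ],
}

student_row, enrolment_rows = split_repeating_group(unf_student)
```

No attribute is lost or invented: everything in the nested record appears in exactly one of the two results, apart from student_num, which is copied so the Enrolment rows can be traced back to their student.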

Anomalies from un-normalised data

Enrolment (Student-num, Subject-code, subject-name, semester, year, score, grade) [key: Student-num, Subject-code]
Delete anomaly: can lose student data when a subject record is deleted
Insert anomaly: requires information about multiple entities when wanting to add just one
Update anomaly: one column value change requires repetition of the change for every record
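
Two of these anomalies can be shown with a few Enrolment rows held as Python dictionaries. The sketch below is illustrative only: the rows, the subject code ICT999 and the function names are invented, and the delete case shows subject data being lost with an enrolment row, which is the same kind of loss seen from the other side.

```python
def subject_names(rows):
    """Subject names that can still be recovered from the Enrolment rows alone."""
    return {row["subject_code"]: row["subject_name"] for row in rows}


def rename_subject(rows, subject_code, new_name):
    """Update anomaly: the same change must be repeated on every matching row."""
    changed = 0
    for row in rows:
        if row["subject_code"] == subject_code:
            row["subject_name"] = new_name
            changed += 1
    return changed


enrolments = [
    {"student_num": "S1", "subject_code": "ICT118",
     "subject_name": "Database and Information Management"},
    {"student_num": "S2", "subject_code": "ICT118",
     "subject_name": "Database and Information Management"},
    {"student_num": "S1", "subject_code": "ICT999",
     "subject_name": "Hypothetical Subject"},
]

# One logical change (a new subject name) touches one row per enrolled student.
rows_changed = rename_subject(enrolments, "ICT118", "Databases")

# Delete anomaly: removing the only enrolment in ICT999 also loses that subject's name.
remaining = [row for row in enrolments if row["subject_code"] != "ICT999"]
```

If the subject name were stored once, in its own relation, the rename would touch a single row and deleting an enrolment would not erase the subject.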

Normalisation Process
Retain every attribute from the UNF collection in the new 1NF collections
Do not invent/create new attributes
Remember to copy the key of the original un-normalised entity into the new entity formed from the repeating group of attributes
If there is NO repeating group then the original entity is already in First Normal Form

1NF to 2NF process


1NF entities with a single attribute as key are already in 2NF!
The process only applies to 1NF entities with a compound key (2 or more attributes)
Enrolment (Student-num, Subject-code, subject-name, semester, year, score, grade) [key: Student-num, Subject-code]


1NF to 2NF Process


Check if any attributes are functionally dependent on only part of the compound key
Enrolment (Student-num, Subject-code, subject-name, semester, year, score, grade) [key: Student-num, Subject-code]
Subject-name and semester are functionally dependent only on Subject-code because, if only Student-num changed, subject-name and semester would NOT change

1NF to 2NF Process


Enrolment (Student-num, Subject-code, subject-name, semester, year, score, grade) [key: Student-num, Subject-code]
Becomes:
Enrolment (Student-num, Subject-code, year, score, grade) [key: Student-num, Subject-code]
Plus
Subject (Subject-code, subject-name, semester) [key: Subject-code]
All in second normal form (2NF)
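
The 1NF to 2NF step for Enrolment can be sketched as a function that moves the attributes depending only on the subject code into their own collection. The function name to_2nf, the dictionary keys and the sample rows are assumptions made for illustration.

```python
def to_2nf(enrolment_rows):
    """Split out the attributes that depend on subject_code alone."""
    # Enrolment keeps the full compound key plus the attributes that need both parts.
    enrolment = [
        {k: row[k] for k in ("student_num", "subject_code", "year", "score", "grade")}
        for row in enrolment_rows
    ]
    # Subject holds one row per subject_code, however many students enrolled.
    subject = {}
    for row in enrolment_rows:
        subject[row["subject_code"]] = {
            "subject_code": row["subject_code"],
            "subject_name": row["subject_name"],
            "semester": row["semester"],
        }
    return enrolment, list(subject.values())


# Hypothetical 1NF rows.
rows_1nf = [
    {"student_num": "S1", "subject_code": "ICT118", "subject_name": "Databases",
     "semester": 2, "year": 2010, "score": 72, "grade": "C"},
    {"student_num": "S2", "subject_code": "ICT118", "subject_name": "Databases",
     "semester": 2, "year": 2010, "score": 58, "grade": "P"},
    {"student_num": "S1", "subject_code": "ICT999", "subject_name": "Hypothetical Subject",
     "semester": 1, "year": 2010, "score": 85, "grade": "HD"},
]

enrolment_2nf, subject_2nf = to_2nf(rows_1nf)
```

Three enrolment rows produce only two subject rows, because the subject name and semester are now recorded once per subject instead of once per enrolled student.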

Current entity status


Student (Student-num, Course-code, Course-type, Enrol-date, Finish-date, Graduate-date, Term-address, Home-address, Enrolment-type) [key: Student-num]
Enrolment (Student-num, Subject-code, year, score, grade) [key: Student-num, Subject-code]
Subject (Subject-code, subject-name, semester) [key: Subject-code]
All in second normal form (2NF)

2NF to 3NF Process


Check if any attributes are functionally dependent on any other non-key attributes
Course-type is transitively dependent on the key Student-num because it depends on Course-code: if Course-code changes, so will Course-type
Student (Student-num, Course-code, Course-type, Enrol-date, Finish-date, Graduate-date, Term-address, Home-address, Enrolment-type) [key: Student-num]

2NF to 3NF Process


Student becomes:
Student (Student-num, Course-code, Enrol-date, Finish-date, Graduate-date, Term-address, Home-address, Enrolment-type) [key: Student-num]
Plus:
Course (Course-code, Course-type) [key: Course-code]
Both in third normal form (3NF)
Note that Course-code remains in Student. Never delete a key.
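
The 2NF to 3NF step can be sketched the same way: Course-type moves out to a Course collection, while Course-code stays behind in Student so the type can still be looked up. The function names, dictionary keys and sample values below are assumptions made for illustration, and only a few Student attributes are shown.

```python
def to_3nf(student_rows):
    """Move course_type, which depends on course_code, into its own collection."""
    course = {row["course_code"]: row["course_type"] for row in student_rows}
    # course_code is kept in Student: it is the link to Course.
    student = [
        {k: v for k, v in row.items() if k != "course_type"}
        for row in student_rows
    ]
    return student, course


def course_type_of(student_num, student, course):
    """Recover a student's course type by following course_code into Course."""
    for row in student:
        if row["student_num"] == student_num:
            return course[row["course_code"]]
    raise KeyError(student_num)


# Hypothetical 2NF rows.
students_2nf = [
    {"student_num": "S1", "course_code": "C10", "course_type": "Bachelor"},
    {"student_num": "S2", "course_code": "C10", "course_type": "Bachelor"},
    {"student_num": "S3", "course_code": "C20", "course_type": "Diploma"},
]

student_3nf, course_3nf = to_3nf(students_2nf)
```

Because Course-code is retained, nothing is lost by the split: course_type_of("S3", student_3nf, course_3nf) still yields the same type that the original row held.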

Final Result of Normalisation


Student (Student-num, Course-code, Enrol-date, Finish-date, Graduate-date, Term-address, Home-address, Enrolment-type) [key: Student-num]
Course (Course-code, Course-type) [key: Course-code]
Enrolment (Student-num, Subject-code, year, score, grade) [key: Student-num, Subject-code]
Subject (Subject-code, subject-name, semester) [key: Subject-code]
All in third normal form (3NF)
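
As a rough illustration of how the four 3NF relations might become database tables, the sketch below creates them in an in-memory SQLite database from Python. The column types, the snake_case column names and the foreign key clauses are assumptions added for the example; the lecture only specifies the attribute lists and keys.

```python
import sqlite3

SCHEMA = """
CREATE TABLE course (
    course_code TEXT PRIMARY KEY,
    course_type TEXT
);
CREATE TABLE student (
    student_num    TEXT PRIMARY KEY,
    course_code    TEXT REFERENCES course(course_code),
    enrol_date     TEXT,
    finish_date    TEXT,
    graduate_date  TEXT,
    term_address   TEXT,
    home_address   TEXT,
    enrolment_type TEXT
);
CREATE TABLE subject (
    subject_code TEXT PRIMARY KEY,
    subject_name TEXT,
    semester     TEXT
);
-- Compound key: one row per student per subject.
CREATE TABLE enrolment (
    student_num  TEXT REFERENCES student(student_num),
    subject_code TEXT REFERENCES subject(subject_code),
    year         INTEGER,
    score        INTEGER,
    grade        TEXT,
    PRIMARY KEY (student_num, subject_code)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the tables that now exist in the database.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

Each relation maps to one table and each key to a PRIMARY KEY, so the one un-normalised Student entity ends up as four tables.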