
© 2007 Progress Software Corporation

Wizard, Progress Software Corporation
Manager, US-East, Solution Engineer, Progress Software Corporation

Rules are made to be broken

To every rule,
there is an exception!


If you thought this talk was going to be about indexing …

It isn't. Nor is it about performance.

Topics

- Theory:
  • What is Database Design
  • Basic Elements
  • Representing the Model as Tables
- Practice:
  • An Example
- Some Other Topics

First, a little theory


What do we mean by database design?

- A process for defining a model of a subset of the "real" [1] world, then representing it as data in tables in a relational database

At least, that's the definition we will use for the purposes of this talk.

[1] Well, for small values of real, anyway.

Basic Elements

What do we put in our model?

- Just 3 things:
  • Entities
  • Attributes
  • Relationships

The "entity-relationship model" was described by Peter Chen in 1976.

See http://bit.csc.lsu.edu/~chen/chen.html

Basic Elements: Entities

- Can be thought of as nouns:
  • People
    - author, composer, performer, seller, buyer
  • Places
    - home, IP address, URL, destination, factory, store
  • Things
    - song, recording, instrument, car, invoice

Is "telephone number" a place or a thing?

Basic Elements: Attributes

- Can be thought of as adjectives (but only loosely):
  • Length
  • Color
  • Horsepower
  • Part number
  • Song Title
  • Publication Date
  • Size
  • Fabric
  • Owner

Is "telephone number" an attribute or an entity?

Basic Elements: Relationships

- Can be thought of as verbs:
  • has a
  • owns
  • contains
  • supervises
  • performs
  • called
  • sold
  • purchased
  • proved

Is "telephone number" a relationship?

Relationships have attributes too

In May, 1995, Andrew Wiles published a proof of Fermat's Last Theorem

- "In May, 1995": attribute
- "Andrew Wiles": entity
- "published": relationship
- "a proof of Fermat's Last Theorem": entity

What goes in an entity

- Identifying attributes
  • Must be able to uniquely identify the entity
  • Can have more than one way to identify it
  • An identifier can be composite
- Descriptive attributes
  • the values you need to keep track of
  • generally should be simple, not complex

What to include in your model

- The things your application has to keep track of
  • Telephones, wires, switches
- The actions your application or its users perform
  • Make calls, send telephone bills, collect payments
- Some attributes of the things and actions
  • Originating number, date and time of call, duration, called number

- Keep it simple
- Be accurate
- Keep it up to date

What to include in your model

- Consider the goals of the system
- Everything you include should be there for a reason you can state
  • in no more than two sentences
- Everything should have a clear name
  • if you can't name it, it doesn't belong
- Talk to the stakeholders!!!

What to leave out of your model

- The real world has properties that don't matter (to your application)
- The real world has relationships that don't matter
- Things happen in the real world that don't matter
- Keep it simple
  • If you can't say why you need it, leave it out

Logical vs Physical Data Models

- Logical entities often require multiple tables to represent them
  • Tables can be thought of as logical or physical
  • It depends on your point of view
- There is also the physical storage database layout
  • storage areas
  • data extents
  • disks
  • etc.
- We aren't going to talk about the physical database layout
- We will talk about tables

Mapping Your Model to a Database

Simply put,
- Entities become tables
  • Identifiers become indexes
- Attributes become columns
  • Data types: pick appropriate ones
- Relationships become tables or foreign keys
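
The mapping above can be sketched in a few lines. This is only an illustration, assuming SQLite through Python's standard sqlite3 module; the talk does not prescribe a database or language, and the table names, column names, and the second disc's title are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
-- Entity -> table; identifier -> primary key (backed by a unique index)
CREATE TABLE manufacturer (
    manufacturer_id INTEGER PRIMARY KEY,
    name            TEXT NOT NULL UNIQUE
);

-- Attributes -> columns, each with an appropriate data type
CREATE TABLE disc (
    disc_id         INTEGER PRIMARY KEY,
    upc             TEXT NOT NULL,
    title           TEXT NOT NULL,
    -- Relationship "disc is made by manufacturer" -> foreign key
    manufacturer_id INTEGER NOT NULL REFERENCES manufacturer(manufacturer_id)
);
""")

conn.execute("INSERT INTO manufacturer VALUES (1, 'Sony BMG')")
conn.execute(
    "INSERT INTO disc VALUES (1, '8697-07416-2', 'The Essential Joshua Bell', 1)"
)


def insert_orphan_disc():
    """Try to store a disc whose manufacturer (id 99) does not exist."""
    try:
        # 'Untitled' is a hypothetical placeholder title
        conn.execute("INSERT INTO disc VALUES (2, '314-510347-2', 'Untitled', 99)")
        return True
    except sqlite3.IntegrityError:
        return False


orphan_accepted = insert_orphan_disc()
disc_count = conn.execute("SELECT COUNT(*) FROM disc").fetchone()[0]
```

A relationship that carries attributes of its own, or that is many to many, would get its own table instead of a foreign key column.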

"In theory, there is no difference between theory and practice, but in practice there is."

Jan van de Snepscheut

Now for some practice.


An example

- Music store
  • Buys compact disc recordings from distributors
  • Has inventory
  • Allows customers to search for what they want
    - Maybe in an in-store kiosk or on the web
  • Sells compact discs to customers

What should we do first?


Activities

- We buy discs from a distributor
- Orders are sent to a distributor
- Orders are delivered to the store
- Orders may be cancelled
- We sell discs to customers in sales transactions
- Customers buy discs in sales transactions
- Customers search for what they want to buy

Which of these must be remembered by the system?

What do we need to keep track of?

- Discs we have
- Discs we sold
- Discs we know about and can get
- Discs we have ordered
- Information needed to do our income tax
  • what we paid for stock
  • when we bought it
  • what we sold it for
  • when we sold it

Disc entities

- UPC Code: 8697-07416-2
- Manufacturer: Sony BMG
- Cost to us: $2.00
- Price charged: $17.95
- Tax charged: $0.80
- Date purchased: March 19, 2007
- Date sold: June 9, 2007

Disc table might look like this

upc           manuf           cost  price  tax   datePurch   dateSold
8697-07416-2  Sony BMG        2.00  17.95  0.90  2007-03-19  2007-06-09
8697-07416-2  Sony BMG        2.00  ?      ?     2007-06-09  ?
314-510347-2  Island Records  2.21  15.95  0.80  2006-01-12  2007-02-14
314-510347-2  Island Records  2.21  ?      ?     2006-01-12  ?

What's wrong?

- Is upc a unique identifier?
- Might have bought from a distributor
- Have no information about what is on the disc
  • How do customers search?
- Don't know when disc was made
- Could be more than one tax jurisdiction
  • provincial tax, city tax
- Don't know if disc is on order
- Don't know who bought it
- Duplicated data
- Etc., etc.
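
The first question can be checked mechanically. A minimal sketch in Python, using three of the columns from the Disc table rows (upc, manuf, datePurch); the helper name duplicated_keys is made up for this illustration.

```python
from collections import Counter

# Rows taken from the Disc table slide: (upc, manuf, datePurch)
disc_rows = [
    ("8697-07416-2", "Sony BMG", "2007-03-19"),
    ("8697-07416-2", "Sony BMG", "2007-06-09"),
    ("314-510347-2", "Island Records", "2006-01-12"),
    ("314-510347-2", "Island Records", "2006-01-12"),
]


def duplicated_keys(rows, key_columns):
    """Return the key values that occur on more than one row."""
    counts = Counter(tuple(row[i] for i in key_columns) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# upc alone repeats for every title we stock more than once
upc_only = duplicated_keys(disc_rows, [0])

# adding the purchase date still fails when two copies arrive together
upc_and_date = duplicated_keys(disc_rows, [0, 2])
```

Both candidate keys come back with duplicates, so neither identifies an individual copy of a disc.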

Disc entities take 2

- UPC Code: 8697-07416-2
- Manufacturer: Sony BMG
- Distributor: Bob's Wholesale CD's
- Cost to us: $2.00
- Price charged: $17.95
- Tax charged: $0.80
- Date ordered: March 19, 2007
- Date received: March 20, 2007
- Date sold: June 9, 2007
- Disc Title: "The Essential Joshua Bell"
- Artist: Joshua Bell
- Track 1: "Danse Russe"
- Track 2: "Violin Concerto in E Minor"
- Track 3: "Nocturne in C-sharp Minor"
- etc.

Example: Now what's wrong?

- This is getting messy
- Activities combined with disc's attributes
- Have duplicated information
- How many tracks can there be?
- What if there is more than one artist?
- Don't have all the information a customer might want to use to search

Discs revisited

- Discs have titles
- Discs have pictures on the cover
- Discs contain tracks
- Discs are made by manufacturers
- Discs are purchased from distributors
- Discs are ordered from distributors
- Discs are delivered to the store
- Discs are sold to customers

"Discs contain tracks …"

- Tracks contain songs
- Tracks occur in order
- Tracks have a duration
- Songs are performed in performances
- Songs have performers (usually)
- Songs have composers
- Songs have names (titles)
- Songs have a key (but not always)
- Performances are done by performers
- Performers can be groups (bands, orchestras, etc.)
- Performances are performed in a location or venue

We seem to need these entities

- Discs
- Manufacturers
- Distributors
- Orders
- Customers
- Inventory
- Tracks
- Songs
- Performers
- Groups?

Songs have names (titles).

Are names properties of songs?

Or are they entities related to songs?

Or are they something else?


Song data (track 1)

Title         "Danse Russe" from Swan Lake, Op. 20
Time          4:30
Composer      Peter Tchaikovsky
Category      Classical, violin, orchestra
Performers    Joshua Bell, Michael Tilson Thomas, Berlin Philharmonic Orchestra
Track number  1
Disc upc      8697-07416-2

Song data (track 2)

Title         Violin Concerto in E Minor, Op. 64
Time          6:27
Composer      Felix Mendelssohn
Category      Classical, violin, orchestra
Performers    Joshua Bell, Sir Roger Norrington, Camerata Salzburg
Track number  2
Disc upc      8697-07416-2

Performance data

Title       Violin Concerto in E Minor, Op. 64
Time        6:27
Composer    Felix Mendelssohn
Category    Classical, violin, orchestra
Performers  Joshua Bell, Sir Roger Norrington, Camerata Salzburg

Performance data take 2

Title                 Violin Concerto in E Minor, Op. 64
Time                  6:27
Composer              Felix Mendelssohn
Category              Classical, violin, orchestra
Performers            Joshua Bell, Sir Roger Norrington, Camerata Salzburg
Performance Date      ?
Performance Location  ?

Performer data

id  name
1   Joshua Bell
2   Sir Roger Norrington
3   Camerata Salzburg
4   Michael Tilson Thomas
5   Berlin Philharmonic
6   Bono
7   The Edge
8   Adam Clayton
9   Larry Mullen

Performance to Performer Relationship

performance id  performer id
1               1
1               4
1               5
1               …
2               1
2               2
2               3
2               …
325             6
325             7
325             8
325             9
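
A many-to-many relationship like this one is usually stored as a link table holding one row per pair. A minimal sketch, again assuming SQLite via Python's sqlite3 module; the table and column names are hypothetical, and only a subset of the rows from the slides is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE performer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- One row per (performance, performer) pair; the pair itself is the identifier
CREATE TABLE performance_performer (
    performance_id INTEGER NOT NULL,
    performer_id   INTEGER NOT NULL REFERENCES performer(id),
    PRIMARY KEY (performance_id, performer_id)
);
""")
conn.executemany("INSERT INTO performer VALUES (?, ?)", [
    (1, "Joshua Bell"), (6, "Bono"), (7, "The Edge"),
    (8, "Adam Clayton"), (9, "Larry Mullen"),
])
conn.executemany("INSERT INTO performance_performer VALUES (?, ?)", [
    (1, 1), (2, 1), (325, 6), (325, 7), (325, 8), (325, 9),
])


def performers_of(performance_id):
    """Names of everyone who took part in one performance."""
    rows = conn.execute(
        "SELECT p.name FROM performance_performer AS pp "
        "JOIN performer AS p ON p.id = pp.performer_id "
        "WHERE pp.performance_id = ? ORDER BY p.id",
        (performance_id,),
    )
    return [name for (name,) in rows]


def performances_of(performer_id):
    """Ids of every performance one performer took part in."""
    rows = conn.execute(
        "SELECT performance_id FROM performance_performer "
        "WHERE performer_id = ? ORDER BY performance_id",
        (performer_id,),
    )
    return [pid for (pid,) in rows]
```

The same table answers the question in both directions: performers_of(325) lists the band, and performances_of(1) lists every performance Joshua Bell appears in.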
Performance data take 3

Performance id  2
Title           Violin Concerto in E Minor, Op. 64
Time            6:27
Composer        Felix Mendelssohn
Category        Classical, violin, orchestra

Track to Performance Relationship

Disc upc      Track Num  Performance id
8697-07416-2  1          1
8697-07416-2  2          2
…             …          …
314-510347-2  1          325
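
Here the identifier is composite: a track is identified by its disc plus its position on that disc. A hedged sketch of that idea, assuming SQLite via Python's sqlite3 module, with hypothetical table and column names and only the two tracks whose titles appear in the slides:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE performance (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE track (
    disc_upc       TEXT    NOT NULL,
    track_num      INTEGER NOT NULL,
    performance_id INTEGER NOT NULL REFERENCES performance(id),
    PRIMARY KEY (disc_upc, track_num)  -- composite identifier: disc + position
);
""")
conn.executemany("INSERT INTO performance VALUES (?, ?)", [
    (1, "Danse Russe from Swan Lake, Op. 20"),
    (2, "Violin Concerto in E Minor, Op. 64"),
])
conn.executemany("INSERT INTO track VALUES (?, ?, ?)", [
    ("8697-07416-2", 1, 1),
    ("8697-07416-2", 2, 2),
])


def track_listing(disc_upc):
    """Track numbers and titles for one disc, in playing order."""
    rows = conn.execute(
        "SELECT t.track_num, p.title FROM track AS t "
        "JOIN performance AS p ON p.id = t.performance_id "
        "WHERE t.disc_upc = ? ORDER BY t.track_num",
        (disc_upc,),
    )
    return rows.fetchall()


def add_track(disc_upc, track_num, performance_id):
    """Return False when that position on the disc is already taken."""
    try:
        conn.execute(
            "INSERT INTO track VALUES (?, ?, ?)",
            (disc_upc, track_num, performance_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


listing = track_listing("8697-07416-2")
second_track_one = add_track("8697-07416-2", 1, 2)  # position 1 already exists
```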


Relationships (so far):

[Diagram: disc to track, one to many; track to performance, one to one; performance to performer, many to many]

What happened to Songs?

Relationships (take 2):

[Diagram: song to performance, one to many; disc to track, one to many; track to performance, one to one; performance to performer, many to many]

Relationships (take 3):

[Diagram of the disc, track, song, performance, and performer entities]

What about "business entities"?

Where are they?

Business entities

[Diagram of the disc, track, song, performance, and performer entities]

Should you use arrays?


Indexes

- Enforce uniqueness
- Make searches faster
- Enable fast retrieval of entities by their
identities
- Enable finding entities with certain attributes
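
The first and third points can be tried out directly. A small sketch, assuming SQLite via Python's sqlite3 module; treating performer names as unique is an assumption made only to have something to enforce, and the index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE performer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Enforces uniqueness and makes lookups by name fast
CREATE UNIQUE INDEX performer_name ON performer (name);
""")
conn.execute("INSERT INTO performer (name) VALUES ('Joshua Bell')")


def add_performer(name):
    """Insert a performer; return the new id, or None if the name exists."""
    try:
        cur = conn.execute("INSERT INTO performer (name) VALUES (?)", (name,))
        return cur.lastrowid
    except sqlite3.IntegrityError:
        return None


duplicate_id = add_performer("Joshua Bell")  # rejected by the unique index
new_id = add_performer("Bono")

# Retrieval by identity goes through the same index
found = conn.execute(
    "SELECT id FROM performer WHERE name = ?", ("Bono",)
).fetchone()
```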

What indexes do we need for the music store database?

Tables

0) Discs
1) Tracks
2) Songs
3) Performers
4) Performances
5) Tracks of discs
6) Performances of songs
7) Performers of performances


What indexes do we need

0) Indexes for identifying attributes
1) A unique row identifier
2) Indexes for the queries you will do

What should we do next?

Other Topics

- Normalization
- Unique keys
- Word indexes
- Naming
- Customisation


Normalization

- Oversimplified, it means:
  • Don't duplicate data
- Attributes should be simple
  • have only one value
  • be necessary
  • not be derived data
  • not repeat
- Complicated attributes are often entities in their own right
  • For example, addresses might be
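
The multi-valued "Performers" attribute from the song data slides is a handy case. A minimal Python sketch of splitting it so that each performer is stored once; the ids are generated here for illustration and are not the ids of the performer table shown earlier.

```python
# Flattened "Performers" values from the song data slides, keyed by track number
flat_tracks = {
    1: "Joshua Bell, Michael Tilson Thomas, Berlin Philharmonic Orchestra",
    2: "Joshua Bell, Sir Roger Norrington, Camerata Salzburg",
}


def normalize(flat):
    """Split multi-valued strings into performers plus link rows."""
    performer_ids = {}  # name -> generated id; each performer appears once
    links = []          # (track number, performer id), one row per pair
    for track_num, names in flat.items():
        for name in (n.strip() for n in names.split(",")):
            performer_id = performer_ids.setdefault(name, len(performer_ids) + 1)
            links.append((track_num, performer_id))
    return performer_ids, links


performer_ids, links = normalize(flat_tracks)
```

Joshua Bell ends up with one performer row and two link rows, instead of his name being repeated on every track.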

Unique keys

- EVERY table must have a unique key
- EVERY row needs a unique identifier
  • that never changes even if moved to another database (e.g. if you replicate)
- Often, users don't need to see it
- Use a UUID or sequence or maybe datetime
- Unique key is the ONLY way to identify rows unambiguously
- ROWIDs are temporary and can change
- Use the same method throughout
  • You'll be glad you did
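
One way to follow this advice is sketched below, assuming Python's standard uuid module and SQLite via sqlite3; the customer table and its column names are hypothetical, and a sequence would serve equally well.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE customer (
    customer_key TEXT PRIMARY KEY,  -- UUID as text; users never need to see it
    name         TEXT NOT NULL
)""")


def add_customer(name):
    """Insert a customer under a freshly generated key and return the key."""
    key = str(uuid.uuid4())
    conn.execute("INSERT INTO customer VALUES (?, ?)", (key, name))
    return key


alice_key = add_customer("Alice")
other_alice_key = add_customer("Alice")  # same name, still a distinct row
row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Because the key does not depend on where the row is stored, it can stay the same when the row is copied to another database.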

Word indexes

- Can be used to hold multiple status or attribute values
  • Conflicts with normalization
  • Flexible
- Easy to add new ones
- Queries are fast
- Example:
  • Category: classical, violin, orchestral, concerto
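
The idea behind a word index can be sketched as a map from each word to the rows containing it. This is a plain Python stand-in for the concept, not the database's own word-index feature; the row ids and function names are hypothetical.

```python
from collections import defaultdict

# Category values in the style of the slides, keyed by a hypothetical row id
categories = {
    1: "classical, violin, orchestra",
    2: "classical, violin, orchestral, concerto",
}


def build_word_index(rows):
    """Map each word to the set of row ids whose text contains it."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for word in text.lower().replace(",", " ").split():
            index[word].add(row_id)
    return index


def find_all(index, *words):
    """Row ids whose text contains every one of the given words."""
    sets = [index.get(w.lower(), set()) for w in words]
    return sorted(set.intersection(*sets)) if sets else []


word_index = build_word_index(categories)
```

Adding a new category value needs no schema change, which is the flexibility the slide refers to; the price is a column that holds more than one value.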


Naming

Good names are crucial to understanding

- What is in the column "GL01262"?
- Table and column names should have clear meanings everyone can understand
  • "GL01262" vs "dateEntered"
- Names with dashes cause inconvenience with SQL
  • "order-date"
- Booleans should be named for truth value
  • "backOrdered"
- No double negations
  • "notOutOfStock"

Making tables customizable

We will look at 3 ways:

- Spare columns
- Separate table with spare columns
- Separate table with name/value pairs

Spare columns in table

001  Bob    Phoenix  frozen  ?       0.0
002  Alice  Boston   ?       125.46  0.12
003  Eve    Denver   ?       ?       ?

- What data types should you use?
- How many spare columns?
- Wasted columns when not used
- How do you know what each spare got used for?
- How do you know how many unused spares you have?

Separate table for spare columns

001  Bob    Phoenix
002  Alice  Boston
003  Eve    Denver

001  frozen  ?       0.0
002  ?       125.46  0.12

Separate table with name/value pairs

001  Bob    Phoenix
002  Alice  Boston
003  Eve    Denver

001  status    frozen
002  owed      125.46
002  discount  0.12
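
A sketch of the name/value approach, assuming SQLite via Python's sqlite3 module. The column headings did not survive on the slide, so the table and column names here (customer, customer_extra, name, city, value) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    id   TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    city TEXT NOT NULL
);
-- One row per customised attribute; no spare columns to run out of
CREATE TABLE customer_extra (
    customer_id TEXT NOT NULL REFERENCES customer(id),
    name        TEXT NOT NULL,
    value       TEXT NOT NULL,  -- everything is stored as text
    PRIMARY KEY (customer_id, name)
);
""")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", [
    ("001", "Bob", "Phoenix"),
    ("002", "Alice", "Boston"),
    ("003", "Eve", "Denver"),
])
conn.executemany("INSERT INTO customer_extra VALUES (?, ?, ?)", [
    ("001", "status", "frozen"),
    ("002", "owed", "125.46"),
    ("002", "discount", "0.12"),
])


def extras_for(customer_id):
    """Return one customer's extra attributes as a name -> value dict."""
    rows = conn.execute(
        "SELECT name, value FROM customer_extra WHERE customer_id = ?",
        (customer_id,),
    )
    return dict(rows)
```

Customers without customisations, like Eve, simply have no rows in the second table. The trade-off is that every value arrives as text and must be converted by the application.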

Modeling Tools

- PCase
- Enterprise Architect
- PowerDesigner
- ConceptDraw
- ERwin
- Rational

Pencil and paper!
Blackboard!

Summary

- Understand the requirements
- Leave out what is not needed
- Review the design with stakeholders
- Evolve the design as changes come up
- Test to make sure it works
  • Can it do everything that is needed?
  • Does it perform adequately?
- Expect changes to come

Homework

- Papers
  • Wiles, A.: "Modular elliptic curves and Fermat's Last Theorem", Annals of Mathematics 141 (3): 443-551
  • Chen, P.: "The Entity-Relationship Model -- Toward a Unified View of Data", ACM Transactions on Database Systems, Vol 1, No 1, 1976
- Wikipedia articles to start from:
  • entity-relationship model
  • data model
- Books:
  • Teorey, Lightstone, Nadeau: Database Modeling and Design, Morgan Kaufmann.
