
DATABASE SOLUTIONS (2nd Edition) THOMAS M CONNOLLY & CAROLYN E BEGG

SOLUTIONS TO REVIEW QUESTIONS

Database Solutions (2nd Edition)

Chapter 1 Introduction - Review questions

1.1

List four examples of database systems other than those listed in Section 1.1.

Some examples could be:
A system that maintains component part details for a car manufacturer;
An advertising company keeping details of all clients and adverts placed with them;
A training company keeping course information and participants' details;
An organization maintaining all sales order information.

1.2

Discuss the meaning of each of the following terms:

(a) data

For end users, this constitutes all the different values connected with the various objects/entities that are of concern to them.

(b) database

A shared collection of logically related data (and a description of this data), designed to meet the information needs of an organization.

(c) database management system

A software system that enables users to define, create, and maintain the database and provides controlled access to this database.

(d) application program

A computer program that interacts with the database by issuing an appropriate request (typically an SQL statement) to the DBMS.

(e) data independence

This is essentially the separation of underlying file structures from the programs that operate on them, also called program-data independence.

(f) views

A virtual table that does not necessarily exist in the database but is generated by the DBMS from the underlying base tables whenever it's accessed. Views present only a subset of the database that is of particular interest to a user. Views can be customized (for example, field names may change), and they also provide a level of security by preventing users from seeing certain data.

1.3

Describe the main characteristics of the database approach.

Focus is now on the data first, and then the applications. The structure of the data is now kept separate from the programs that operate on the data. This is held in the system catalog or data dictionary. Programs can now share data, which is no longer fragmented. There is also a reduction in redundancy, and achievement of program-data independence.

1.4

Describe the five components of the DBMS environment and discuss how they relate to each other.

(1) Hardware: The computer system(s) that the DBMS and the application programs run on. This can range from a single PC, to a single mainframe, to a network of computers.

(2) Software: The DBMS software and the application programs, together with the operating system, including network software if the DBMS is being used over a network.

(3) Data: The data acts as a bridge between the hardware and software components and the human components. As we've already said, the database contains both the operational data and the meta-data (the 'data about data').

(4) Procedures: The instructions and rules that govern the design and use of the database. This may include instructions on how to log on to the DBMS, make backup copies of the database, and how to handle hardware or software failures.

(5) People: This includes the database designers, database administrators (DBAs), application programmers, and the end-users.

1.5

Describe the problems with the traditional two-tier client-server architecture and discuss how these problems were overcome with the three-tier client-server architecture.

In the mid-1990s, as applications became more complex and potentially could be deployed to hundreds or thousands of end-users, the client side of this architecture gave rise to two problems:
A 'fat' client, requiring considerable resources on the client's computer to run effectively (resources include disk space, RAM, and CPU power).
A significant client side administration overhead.

By 1995, a new variation of the traditional two-tier client-server model appeared to solve these problems, called the three-tier client-server architecture. This new architecture proposed three layers, each potentially running on a different platform:

(1) The user interface layer, which runs on the end-user's computer (the client).

(2) The business logic and data processing layer. This middle tier runs on a server and is often called the application server. One application server is designed to serve multiple clients.

(3) A DBMS, which stores the data required by the middle tier. This tier may run on a separate server called the database server.

The three-tier design has many advantages over the traditional two-tier design, such as:
A 'thin' client, which requires less expensive hardware.
Simplified application maintenance, as a result of centralizing the business logic for many end-users into a single application server. This eliminates the concerns of software distribution that are problematic in the traditional two-tier client-server architecture.
Added modularity, which makes it easier to modify or replace one tier without affecting the other tiers.
Easier load balancing, again as a result of separating the core business logic from the database functions. For example, a Transaction Processing Monitor (TPM) can be used to reduce the number of connections to the database server. (A TPM is a program that controls data transfer between clients and servers in order to provide a consistent environment for Online Transaction Processing (OLTP).)

An additional advantage is that the three-tier architecture maps quite naturally to the Web environment, with a Web browser acting as the 'thin' client, and a Web server acting as the application server. The three-tier client-server architecture is illustrated in Figure 1.4.

1.6

Describe the functions that should be provided by a modern full-scale multi-user DBMS.

Data Storage, Retrieval and Update
A User-Accessible Catalog
Transaction Support
Concurrency Control Services
Recovery Services
Authorization Services
Support for Data Communication
Integrity Services
Services to Promote Data Independence
Utility Services

1.7

Of the functions described in your answer to Question 1.6, which ones do you think would not be needed in a standalone PC DBMS? Provide justification for your answer.

Concurrency Control Services - only single user.
Authorization Services - only single user, but may be needed if different individuals are to use the DBMS at different times.
Utility Services - limited in scope.
Support for Data Communication - only standalone system.

1.8

Discuss the advantages and disadvantages of DBMSs.

Some advantages of the database approach include control of data redundancy, data consistency, sharing of data, and improved security and integrity. Some disadvantages include complexity, cost, reduced performance, and higher impact of a failure.

Chapter 2 The Relational Model - Review questions

2.1

Discuss each of the following concepts in the context of the relational data model:

(a) relation

A table with columns and rows.

(b) attribute

A named column of a relation.

(c) domain

The set of allowable values for one or more attributes.

(d) tuple

A record of a relation.

(e) relational database

A collection of normalized tables.

2.2

Discuss the properties of a relational table.

A relational table has the following properties:
The table has a name that is distinct from all other tables in the database.
Each cell of the table contains exactly one value. (For example, it would be wrong to store several telephone numbers for a single branch in a single cell. In other words, tables don't contain repeating groups of data. A relational table that satisfies this property is said to be normalized or in first normal form.)
Each column has a distinct name.
The values of a column are all from the same domain.
The order of columns has no significance. In other words, provided a column name is moved along with the column values, we can interchange columns.
Each record is distinct; there are no duplicate records.

The order of records has no significance, theoretically.

2.3

Discuss the differences between the candidate keys and the primary key of a table. Explain what is meant by a foreign key. How do foreign keys of tables relate to candidate keys? Give examples to illustrate your answer.

The primary key is the candidate key that is selected to identify tuples uniquely within a relation. A foreign key is an attribute or set of attributes within one relation that matches the candidate key of some (possibly the same) relation.

2.4

What does a null represent?

Represents a value for a column that is currently unknown or is not applicable for this record.

2.5

Define the two principal integrity rules for the relational model. Discuss why it is desirable to enforce these rules.

Entity integrity
In a base table, no column of a primary key can be null.

Referential integrity
If a foreign key exists in a table, either the foreign key value must match a candidate key value of some record in its home table or the foreign key value must be wholly null.
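The two integrity rules can be seen in action with a small sketch. It uses Python's built-in sqlite3 module purely for illustration; the Branch and Staff tables, their columns, and the try_insert helper are hypothetical names and are not taken from the text. Note two SQLite-specific assumptions: foreign keys are only enforced after PRAGMA foreign_keys = ON, and a non-integer primary key needs an explicit NOT NULL to reject nulls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled

# Hypothetical tables: Staff.branchNo is a foreign key referencing Branch.branchNo.
conn.execute("CREATE TABLE Branch (branchNo TEXT NOT NULL PRIMARY KEY, city TEXT)")
conn.execute(
    "CREATE TABLE Staff ("
    " staffNo TEXT NOT NULL PRIMARY KEY,"
    " branchNo TEXT REFERENCES Branch(branchNo))"
)
conn.execute("INSERT INTO Branch VALUES ('B001', 'London')")


def try_insert(sql):
    """Run one INSERT and report whether the DBMS accepted or rejected it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# Entity integrity: a null primary key is refused.
null_key = try_insert("INSERT INTO Branch VALUES (NULL, 'Paris')")
# Referential integrity: the foreign key must match an existing key or be wholly null.
matching_fk = try_insert("INSERT INTO Staff VALUES ('S1', 'B001')")
null_fk = try_insert("INSERT INTO Staff VALUES ('S2', NULL)")
dangling_fk = try_insert("INSERT INTO Staff VALUES ('S3', 'B999')")

print(null_key, matching_fk, null_fk, dangling_fk)
```

Only the null primary key and the foreign key value with no matching branch are refused; the matching and the wholly null foreign keys are accepted.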

Chapter 3 SQL and QBE - Review questions

3.1

What are the two major components of SQL and what function do they serve?

A data definition language (DDL) for defining the database structure.
A data manipulation language (DML) for retrieving and updating data.

3.2

Explain the function of each of the clauses in the SELECT statement. What restrictions are imposed on these clauses?

FROM        specifies the table or tables to be used;
WHERE       filters the rows subject to some condition;
GROUP BY    forms groups of rows with the same column value;
HAVING      filters the groups subject to some condition;
SELECT      specifies which columns are to appear in the output;
ORDER BY    specifies the order of the output.

3.3

What restrictions apply to the use of the aggregate functions within the SELECT statement? How do nulls affect the aggregate functions?

An aggregate function can be used only in the SELECT list and in the HAVING clause.

Apart from COUNT(*), each function eliminates nulls first and operates only on the remaining non-null values. COUNT(*) counts all the rows of a table, regardless of whether nulls or duplicate values occur.

3.4

Explain how the GROUP BY clause works. What is the difference between the WHERE and HAVING clauses?

SQL first applies the WHERE clause. Then it conceptually arranges the table based on the grouping column(s). Next, it applies the HAVING clause and finally orders the result according to the ORDER BY clause.

WHERE filters rows subject to some condition; HAVING filters groups subject to some condition.
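The order of evaluation and the treatment of nulls can be checked with a small sketch using Python's built-in sqlite3 module. The Staff table and its rows are invented for illustration and are not taken from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT, branchNo TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?)",
    [
        ("S1", "B001", 30000),
        ("S2", "B001", 12000),
        ("S3", "B002", 24000),
        ("S4", "B002", None),   # salary unknown (null)
        ("S5", "B003", 9000),   # removed by WHERE
        ("S6", "B003", 15000),  # survives WHERE, but its group is removed by HAVING
    ],
)

rows = conn.execute(
    """
    SELECT branchNo,
           COUNT(*)      AS allRows,        -- counts every row, nulls included
           COUNT(salary) AS knownSalaries,  -- nulls are eliminated first
           AVG(salary)   AS avgSalary       -- computed over non-null values only
    FROM Staff
    WHERE salary IS NULL OR salary > 10000  -- row filter, applied before grouping
    GROUP BY branchNo
    HAVING COUNT(*) > 1                     -- group filter, applied after grouping
    ORDER BY branchNo
    """
).fetchall()

for row in rows:
    print(row)
```

Branch B003 disappears because WHERE leaves it with a single row, which HAVING then rejects as a group. For B002 the two counts differ (2 rows, 1 known salary), which shows the difference between COUNT(*) and an aggregate over a column that contains a null.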

3.5

What is the difference between a subquery and a join? Under what circumstances would you not be able to use a subquery?

With a subquery, the columns specified in the SELECT list are restricted to one table. Thus, you cannot use a subquery if the SELECT list contains columns from more than one table.

3.6

What is QBE and what is the relationship between QBE and SQL?

QBE is an alternative, graphical-based, 'point-and-click' way of querying the database, which is particularly suited for queries that are not too complex, and can be expressed in terms of a few tables. QBE has acquired the reputation of being one of the easiest ways for non-technical users to obtain information from the database.

QBE queries are converted into their equivalent SQL statements before transmission to the DBMS server.

Chapter 4 Database Systems Development Lifecycle - Review questions

4.1

Describe what is meant by the term 'software crisis'.

The past few decades have witnessed a dramatic rise in the number of software applications. Many of these applications proved to be demanding, requiring constant maintenance. This maintenance involved correcting faults, implementing new user requirements, and modifying the software to run on new or upgraded platforms. With so much software around to support, the effort spent on maintenance began to absorb resources at an alarming rate. As a result, many major software projects were late, over budget, and the software produced was unreliable, difficult to maintain, and performed poorly. This led to what has become known as the 'software crisis'. Although this term was first used in the late 1960s, more than 30 years later, the crisis is still with us. As a result, some people now refer to the software crisis as the 'software depression'.

4.2

Discuss the relationship between the information systems lifecycle and the database system development lifecycle.

An information system is the resources that enable the collection, management, control, and dissemination of data/information throughout a company. The database is a fundamental component of an information system. The lifecycle of an information system is inherently linked to the lifecycle of the database that supports it.

Typically, the stages of the information systems lifecycle include: planning, requirements collection and analysis, design (including database design), prototyping, implementation, testing, conversion, and operational maintenance. As a database is a fundamental component of the larger company-wide information system, the database system development lifecycle is inherently linked with the information systems lifecycle.

4.3

Briefly describe the stages of the database system development lifecycle.

See Figure 4.1 Stages of the database system development lifecycle.

Database planning is the management activities that allow the stages of the database system development lifecycle to be realized as efficiently and effectively as possible.

System definition involves identifying the scope and boundaries of the database system including its major user views. A user view can represent a job role or business application area.

Requirements collection and analysis is the process of collecting and analyzing information about the company that is to be supported by the database system, and using this information to identify the requirements for the new system.

There are three approaches to dealing with multiple user views, namely the centralized approach, the view integration approach, and a combination of both. The centralized approach involves collating the users' requirements for different user views into a single list of requirements. A data model representing all the user views is created during the database design stage. The view integration approach involves leaving the users' requirements for each user view as separate lists of requirements. Data models representing each user view are created and then merged at a later stage of database design.

Database design is the process of creating a design that will support the company's mission statement and mission objectives for the required database. This stage includes the logical and physical design of the database.

The aim of DBMS selection is to select a system that meets the current and future requirements of the company, balanced against costs that include the purchase of the DBMS product and any additional software/hardware, and the costs associated with changeover and training.

Application design involves designing the user interface and the application programs that use and process the database. This stage involves two main activities: transaction design and user interface design.

Prototyping involves building a working model of the database system, which allows the designers or users to visualize and evaluate the system.

Implementation is the physical realization of the database and application designs.

Data conversion and loading involves transferring any existing data into the new database and converting any existing applications to run on the new database.

Testing is the process of running the database system with the intent of finding errors.

Operational maintenance is the process of monitoring and maintaining the system following installation.

4.4

Describe the purpose of creating a mission statement and mission objectives for the required database during the database planning stage.

The mission statement defines the major aims of the database system, while each mission objective identifies a particular task that the database must support.

4.5

Discuss what a user view represents when designing a database system.

A user view defines what is required of a database system from the perspective of a particular job (such as Manager or Supervisor) or business application area (such as marketing, personnel, or stock control).

4.6

Compare and contrast the centralized approach and view integration approach to managing the design of a database system with multiple user views.

An important activity of the requirements collection and analysis stage is deciding how to deal with the situation where there is more than one user view. There are three approaches to dealing with multiple user views: the centralized approach, the view integration approach, and a combination of both approaches.

Centralized approach
Requirements for each user view are merged into a single list of requirements for the new database system. A logical data model representing all user views is created during the database design stage.

The centralized approach involves collating the requirements for different user views into a single list of requirements. A data model representing all user views is created in the database design stage. A diagram representing the management of user views 1 to 3 using the centralized approach is shown in Figure 4.4. Generally, this approach is preferred when there is a significant overlap in requirements for each user view and the database system is not overly complex.

See Figure 4.4 The centralized approach to managing multiple user views 1 to 3.

View integration approach
Requirements for each user view remain as separate lists. Data models representing each user view are created and then merged later during the database design stage.

The view integration approach involves leaving the requirements for each user view as separate lists of requirements. We create data models representing each user view. A data model that represents a single user view is called a local logical data model. We then merge the local data models to create a global logical data model representing all user views of the company. A diagram representing the management of user views 1 to 3 using the view integration approach is shown in Figure 4.5. Generally, this approach is preferred when there are significant differences between user views and the database system is sufficiently complex to justify dividing the work into more manageable parts.

See Figure 4.5 The view integration approach to managing multiple user views 1 to 3.

For some complex database systems it may be appropriate to use a combination of both the centralized and view integration approaches to managing multiple user views. For example, the requirements for two or more user views may be first merged using the centralized approach and then used to create a local logical data model. (Therefore in this situation the local data model represents not just a single user view but the number of user views merged using the centralized approach.) The local data models representing one or more user views are then merged using the view integration approach to form the global logical data model representing all user views.

4.7

Explain why it is necessary to select the target DBMS before beginning the physical database design phase.

Database design is made up of two main phases called logical and physical design. During logical database design, we identify the important objects that need to be represented in the database and the relationships between these objects. During physical database design, we decide how the logical design is to be physically implemented (as tables) in the target DBMS. Therefore it is necessary to have selected the target DBMS before we are able to proceed to physical database design.

See Figure 4.1 Stages of the database system development lifecycle.

4.8

Discuss the two main activities associated with application design.

The database and application design stages are parallel activities of the database system development lifecycle. In most cases, we cannot complete the application design until the design of the database itself has taken place. On the other hand, the database exists to support the applications, and so there must be a flow of information between application design and database design.

The two main activities associated with the application design stage are the design of the user interface and the design of the application programs that use and process the database.

We must ensure that all the functionality stated in the requirements specifications is present in the application design for the database system. This involves designing the interaction between the user and the data, which we call transaction design. In addition to designing how the required functionality is to be achieved, we have to design an appropriate user interface to the database system.

4.9

Describe the potential benefits of developing a prototype database system.

The purpose of developing a prototype database system is to allow users to use the prototype to identify the features of the system that work well, or are inadequate, and if possible to suggest improvements or even new features for the database system. In this way, we can greatly clarify the requirements and evaluate the feasibility of a particular system design. Prototypes should have the major advantage of being relatively inexpensive and quick to build.

4.10

Discuss the main activities associated with the implementation stage.

The database implementation is achieved using the Data Definition Language (DDL) of the selected DBMS or a graphical user interface (GUI), which provides the same functionality while hiding the low-level DDL statements. The DDL statements are used to create the database structures and empty database files. Any specified user views are also implemented at this stage.

The application programs are implemented using the preferred third or fourth generation language (3GL or 4GL). Parts of these application programs are the database transactions, which we implement using the Data Manipulation Language (DML) of the target DBMS, possibly embedded within a host programming language, such as Visual Basic (VB), VB.net, Python, Delphi, C, C++, C#, Java, COBOL, Fortran, Ada, or Pascal. We also implement the other components of the application design such as menu screens, data entry forms, and reports. Again, the target DBMS may have its own fourth generation tools that allow rapid development of applications through the provision of non-procedural query languages, reports generators, forms generators, and application generators.
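As a minimal sketch of what "DML embedded within a host programming language" can look like, the following uses Python with its built-in sqlite3 module. The Member table and the register_member and rename_member functions are hypothetical and chosen only for illustration; a real application would target whichever DBMS was selected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: create the (initially empty) database structure.
conn.execute("CREATE TABLE Member (memberNo INTEGER PRIMARY KEY, name TEXT NOT NULL)")


# DML embedded in the host language: each function is one small database transaction.
def register_member(name):
    """Insert a new member and return the key the DBMS assigned."""
    cur = conn.execute("INSERT INTO Member (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid


def rename_member(member_no, new_name):
    """Update the name of an existing member."""
    conn.execute("UPDATE Member SET name = ? WHERE memberNo = ?", (new_name, member_no))
    conn.commit()


first = register_member("Ann")
second = register_member("Bob")
rename_member(second, "Robert")

names = [row[0] for row in conn.execute("SELECT name FROM Member ORDER BY memberNo")]
print(names)
```

Values are passed as parameters (the ? placeholders) instead of being pasted into the SQL text, which keeps the embedded statements fixed and leaves quoting to the database driver.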

Security and integrity controls for the application are also implemented. Some of these controls are implemented using the DDL, but others may need to be defined outside the DDL using, for example, the supplied DBMS utilities or operating system controls.

4.11

Describe the purpose of the data conversion and loading stage.

This stage is required only when a new database system is replacing an old system. Nowadays, it's common for a DBMS to have a utility that loads existing files into the new database. The utility usually requires the specification of the source file and the target database, and then automatically converts the data to the required format of the new database files. Where applicable, it may be possible for the developer to convert and use application programs from the old system for use by the new system.

4.12

Explain the purpose of testing the database system.

Before going live, the newly developed database system should be thoroughly tested. This is achieved using carefully planned test strategies and realistic data so that the entire testing process is methodically and rigorously carried out. Note that in our definition of testing we have not used the commonly held view that testing is the process of demonstrating that faults are not present. In fact, testing cannot show the absence of faults; it can show only that software faults are present. If testing is conducted successfully, it will uncover errors in the application programs and possibly the database structure. As a secondary benefit, testing demonstrates that the database and the application programs appear to be working according to their specification and that performance requirements appear to be satisfied. In addition, metrics collected from the testing stage provide a measure of software reliability and software quality.

As with database design, the users of the new system should be involved in the testing process. The ideal situation for system testing is to have a test database on a separate hardware system, but often this is not available. If real data is to be used, it is essential to have backups taken in case of error.

Testing should also cover usability of the database system. Ideally, an evaluation should be conducted against a usability specification. Examples of criteria that can be used to conduct the evaluation include (Sommerville, 2000):
Learnability - How long does it take a new user to become productive with the system?
Performance - How well does the system response match the user's work practice?
Robustness - How tolerant is the system of user error?
Recoverability - How good is the system at recovering from user errors?
Adaptability - How closely is the system tied to a single model of work?

Some of these criteria may be evaluated in other stages of the lifecycle. After testing is complete, the database system is ready to be 'signed off' and handed over to the users.

4.13

What are the main activities associated with the operational maintenance stage?

In this stage, the database system now moves into a maintenance stage, which involves the following activities:
Monitoring the performance of the database system. If the performance falls below an acceptable level, the database may need to be tuned or reorganized.
Maintaining and upgrading the database system (when required). New requirements are incorporated into the database system through the preceding stages of the lifecycle.

Chapter 5 Database Administration and Security - Review questions

5.1

Define the purpose and tasks associated with data administration and database administration.

Data administration is the management and control of the corporate data, including database planning, development and maintenance of standards, policies and procedures, and logical database design.

Database administration is the management and control of the physical realization of the corporate database system, including physical database design and implementation, setting security and integrity controls, monitoring system performance, and reorganizing the database as necessary.

5.2

Compare and contrast the main tasks carried out by the DA and DBA.

The Data Administrator (DA) and Database Administrator (DBA) are responsible for managing and controlling the activities associated with the corporate data and the corporate database, respectively. The DA is more concerned with the early stages of the lifecycle, from planning through to logical database design. In contrast, the DBA is more concerned with the later stages, from application/physical database design to operational maintenance. Depending on the size and complexity of the organization and/or database system the DA and DBA can be the responsibility of one or more people.

5.3

Explain the purpose and scope of database security.

Security considerations do not only apply to the data held in a database. Breaches of security may affect other parts of the system, which may in turn affect the database. Consequently, database security encompasses hardware, software, people, and data. To effectively implement security requires appropriate controls, which are defined in specific mission objectives for the system. This need for security, while often having been neglected or overlooked in the past, is now increasingly recognized by organizations. The reason for this turn-around is the increasing amount of crucial corporate data being stored on computer and the acceptance that any loss or unavailability of this data could be potentially disastrous.

5.4

List the main types of threat that could affect a database system, and for each, describe the possible outcomes for an organization.

Figure 5.1 A summary of the potential threats to computer systems.

5.5

Explain the following in terms of providing security for a database:

authorization; views; backup and recovery; integrity; encryption; RAID.

Authorization
Authorization is the granting of a right or privilege that enables a subject to have legitimate access to a system or a system's object. Authorization controls can be built into the software, and govern not only what database system or object a specified user can access, but also what the user may do with it. The process of authorization involves authentication of a subject requesting access to an object, where 'subject' represents a user or program and 'object' represents a database table, view, procedure, trigger, or any other object that can be created within the database system.

Views
A view is a virtual table that does not necessarily exist in the database but can be produced upon request by a particular user, at the time of request. The view mechanism provides a powerful and flexible security mechanism by hiding parts of the database from certain users. The user is not aware of the existence of any columns or rows that are missing from the view. A view can be defined over several tables with a user being granted the appropriate privilege to use it, but not to use the base tables. In this way, using a view is more restrictive than simply having certain privileges granted to a user on the base table(s).
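The hiding effect of a view can be sketched with Python's built-in sqlite3 module. The Staff table and the StaffB001 view are hypothetical names used only for illustration. SQLite has no GRANT statement, so the privilege step is not shown; in a multi-user DBMS the user would be granted access to the view but not to the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, name TEXT, branchNo TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?, ?)",
    [("S1", "Ann", "B001", 30000), ("S2", "Bob", "B002", 24000)],
)

# The view exposes only two columns and only the rows for branch B001.
conn.execute(
    "CREATE VIEW StaffB001 AS SELECT staffNo, name FROM Staff WHERE branchNo = 'B001'"
)

cur = conn.execute("SELECT * FROM StaffB001")
visible_columns = [d[0] for d in cur.description]
visible_rows = cur.fetchall()

print(visible_columns)
print(visible_rows)
```

A user who can query only StaffB001 sees neither the salary column nor the staff of other branches, and nothing in the result reveals that they exist.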

Backup and recovery
Backup is the process of periodically taking a copy of the database and log file (and possibly programs) onto offline storage media. A DBMS should provide backup facilities to assist with the recovery of a database following failure. To keep track of database transactions, the DBMS maintains a special file called a log file (or journal) that contains information about all updates to the database. It is always advisable to make backup copies of the database and log file at regular intervals and to ensure that the copies are in a secure location. In the event of a failure that renders the database unusable, the backup copy and the details captured in the log file are used to restore the database to the latest possible consistent state. Journaling is the process of keeping and maintaining a log file (or journal) of all changes made to the database to enable recovery to be undertaken effectively in the event of a failure.
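A minimal sketch of taking and restoring a backup copy, using the Connection.backup method of Python's built-in sqlite3 module (available from Python 3.7). The Account table is hypothetical, and in-memory databases stand in for the live database and the offline storage media. The sketch covers only the backup copy; replaying a log file to roll forward to the latest consistent state is not shown.

```python
import sqlite3

# The live database, with one committed row.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE Account (accNo TEXT PRIMARY KEY, balance INTEGER)")
live.execute("INSERT INTO Account VALUES ('A1', 100)")
live.commit()

# Backup: copy the whole database into a second connection.
backup_copy = sqlite3.connect(":memory:")
live.backup(backup_copy)

# Simulate a failure that destroys the data in the live database.
live.execute("DELETE FROM Account")
live.commit()

# Recovery: restore a fresh database from the backup copy.
restored = sqlite3.connect(":memory:")
backup_copy.backup(restored)
restored_rows = restored.execute("SELECT accNo, balance FROM Account").fetchall()

print(restored_rows)
```

Any updates committed after the backup was taken would be missing from the restored copy, which is exactly the gap the log file is meant to close.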

Inte"rity constraints 4ontribute to maintaining a secure database system by preventing data from becoming invalid& and hence giving misleading or incorrect results$

Encryption

Encryption is the encoding of the data by a special algorithm that renders the data unreadable by any program without the decryption key. If a database system holds particularly sensitive data, it may be deemed necessary to encode it as a precaution against possible external threats or attempts to access it. Some DBMSs provide an encryption facility for this purpose. The DBMS can access the data (after decoding it), although there is degradation in performance because of the time taken to decode it. Encryption also protects data transmitted over communication lines. There are a number of techniques for encoding data to conceal the information; some are termed irreversible and others reversible. Irreversible techniques, as the name implies, do not permit the original data to be known. However, the data can be used to obtain valid statistical information. Reversible techniques are more commonly used. To transmit data securely over insecure networks requires the use of a cryptosystem, which includes:

an encryption key to encrypt the data (plaintext);
an encryption algorithm that, with the encryption key, transforms the plaintext into ciphertext;
a decryption key to decrypt the ciphertext;
a decryption algorithm that, with the decryption key, transforms the ciphertext back into plaintext.
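The four components listed above can be sketched with a toy reversible cipher. This is an illustration only and must not be used to protect real data: the keystream construction (SHA-256 over the key plus a counter, XORed with the data) is an assumption chosen because it needs nothing beyond the Python standard library. Here the encryption and decryption keys are the same shared secret.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(plaintext: bytes, encryption_key: bytes) -> bytes:
    """Encryption algorithm: XOR the plaintext with the key-derived stream."""
    stream = keystream(encryption_key, len(plaintext))
    return bytes(p ^ s for p, s in zip(plaintext, stream))

def decrypt(ciphertext: bytes, decryption_key: bytes) -> bytes:
    """Decryption algorithm: XOR is its own inverse, so the same step is reused."""
    return encrypt(ciphertext, decryption_key)

key = b"shared secret"          # hypothetical key
plaintext = b"salary=48000"     # hypothetical sensitive value
ciphertext = encrypt(plaintext, key)
recovered = decrypt(ciphertext, key)
garbage = decrypt(ciphertext, b"wrong key")
print(recovered)
```

Without the correct decryption key, the same algorithm yields unreadable bytes rather than the plaintext.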

Redundant Array of Independent Disks (RAID)

The hardware that the DBMS is running on must be fault-tolerant, meaning that the DBMS should continue to operate even if one of the hardware components fails. This suggests having redundant components that can be seamlessly integrated into the working system whenever there is one or more component failures. The main hardware components that should be fault-tolerant include disk drives, disk controllers, CPU, power supplies, and cooling fans. Disk drives are the most vulnerable components with the shortest times between failures of any of the hardware components. One solution is the use of Redundant Array of Independent Disks (RAID) technology. RAID works by having a large disk array comprising an arrangement of several independent disks that are organized to improve reliability and at the same time increase performance.
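The text does not say how the redundancy is organized. One common approach, used by parity-based RAID levels such as RAID 5, stores the XOR of the data blocks so that any single lost block can be rebuilt. The sketch below assumes three data disks holding one four-byte block each; the block contents are made up.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data_disks = [b"STAF", b"BRAN", b"VIDE"]   # made-up block contents
parity_disk = xor_blocks(data_disks)       # redundant information

failed = 1                                 # pretend the second disk has failed
survivors = [block for i, block in enumerate(data_disks) if i != failed]

# XOR of the surviving blocks and the parity block yields the missing block.
rebuilt = xor_blocks(survivors + [parity_disk])
print(rebuilt)
```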


Chapter 6 Fact-Finding - Review questions


6.1 Briefly describe what the process of fact-finding attempts to achieve for a database developer.

Fact-finding is the formal process of using techniques such as interviews and questionnaires to collect facts about systems, requirements, and preferences.

The database developer uses fact-finding techniques at various stages throughout the database system development lifecycle to capture the necessary facts to build the required database system. The necessary facts cover the business and the users of the database system, including the terminology, problems, opportunities, constraints, requirements, and priorities.

6.2 Describe how fact-finding is used throughout the stages of the database system development lifecycle.

There are many occasions for fact-finding during the database system development lifecycle. However, fact-finding is particularly crucial to the early stages of the lifecycle, including the database planning, system definition, and requirements collection and analysis stages. It's during these early stages that the database developer learns about the terminology, problems, opportunities, constraints, requirements, and priorities of the business and the users of the system. Fact-finding is also used during database design and the later stages of the lifecycle, but to a lesser extent. For example, during physical database design, fact-finding becomes technical as the developer attempts to learn more about the DBMS selected for the database system. Also, during the final stage, operational maintenance, fact-finding is used to determine whether a system requires tuning to improve performance or further development to include new requirements.


6.3 For each stage of the database system development lifecycle identify examples of the facts captured and the documentation produced.


6.4 A database developer normally uses several fact-finding techniques during a single database project. The five most commonly used techniques are examining documentation, interviewing, observing the business in operation, conducting research, and using questionnaires. Describe each fact-finding technique and identify the advantages and disadvantages of each.

Examining documentation can be useful when you're trying to gain some insight as to how the need for a database arose. You may also find that documentation can be helpful to provide information on the business (or part of the business) associated with the problem. If the problem relates to the current system there should be documentation associated with that system. Examining documents, forms, reports, and files associated with the current system is a good way to quickly gain some understanding of the system.

Interviewing is the most commonly used, and normally most useful, fact-finding technique. You can interview to collect information from individuals face-to-face. There can be several objectives to using interviewing such as finding out facts, checking facts, generating user interest and feelings of involvement, identifying requirements, and gathering ideas and opinions.

Observation is one of the most effective fact-finding techniques you can use to understand a system. With this technique, you can either participate in, or watch a person perform, activities to learn about the system. This technique is particularly useful when the validity of data collected through other methods is in question or when the complexity of certain aspects of the system prevents a clear explanation by the end-users.

A useful fact-finding technique is to research the application and problem. Computer trade journals, reference books, and the Internet are good sources of information. They can provide you with information on how others have solved similar problems, plus you can learn whether or not software packages exist to solve your problem.

Another fact-finding technique is to conduct surveys through questionnaires. Questionnaires are special-purpose documents that allow you to gather facts from a large number of people while maintaining some control over their responses. When dealing with a large audience, no other fact-finding technique can tabulate the same facts as efficiently.

6.5 Describe the purpose of defining a mission statement and mission objectives for a database system.

The mission statement defines the major aims of the database system. Those driving the database project within the business (such as the Director and/or owner) normally define the mission statement. A mission statement helps to clarify the purpose of the database project and provides a clearer path towards the efficient and effective creation of the required database system. Once the mission statement is defined, the next activity involves identifying the mission objectives. Each mission objective should identify a particular task that the database must support. The assumption is that if the database supports the mission objectives then the mission statement should be met. The mission statement and objectives may be accompanied by additional information that specifies, in general terms, the work to be done, the resources with which to do it, and the money to pay for it all.



6.6 What is the purpose of the systems definition stage?

The purpose of the system definition stage is to identify the scope and boundary of the database system and its major user views. Defining the scope and boundary of the database system helps to identify the main types of data mentioned in the interviews and a rough guide as to how this data is related. A user view represents the requirements that should be supported by a database system as defined by a particular job role (such as Manager or Assistant) or business application area (such as video rentals or stock control).

6.7 How do the contents of a users' requirements specification differ from a systems specification?

There are two main documents created during the requirements collection and analysis stage, namely the users' requirements specification and the systems specification. The users' requirements specification describes in detail the data to be held in the database and how the data is to be used. The systems specification describes any features to be included in the database system such as the required performance and the levels of security.

6.8 Describe one approach to deciding whether to use centralized, view integration, or a combination of both when developing a database system for multiple user views.

One way to help you make a decision whether to use the centralized, view integration, or a combination of both approaches to manage multiple user views is to examine the overlap in terms of the data used between the user views identified during the system definition stage. It's difficult to give precise rules as to when it's appropriate to use the centralized or view integration approaches. As the database developer, you should base your decision on an assessment of the complexity of the database system and the degree of overlap between the various user views. However, whether you use the centralized or view integration approach or a mixture of both to build the underlying database, ultimately you need to create the original user views for the working database system.


Chapter 7 Entity-Relationship Modeling - Review questions


7.1 Describe what entities represent in an ER model and provide examples of entities with a physical or conceptual existence.

An entity is a set of objects with the same properties, which are identified by a user or company as having an independent existence. Each object, which should be uniquely identifiable within the set, is called an entity occurrence. An entity has an independent existence and can represent objects with a physical (or 'real') existence or objects with a conceptual (or 'abstract') existence.

7.2 Describe what relationships represent in an ER model and provide examples of unary, binary, and ternary relationships.

A relationship is a set of meaningful associations among entities. As with entities, each association should be uniquely identifiable within the set. A uniquely identifiable association is called a relationship occurrence. Each relationship is given a name that describes its function. For example, the Actor entity is associated with the Role entity through a relationship called Plays, and the Role entity is associated with the Video entity through a relationship called Features.

The entities involved in a particular relationship are referred to as participants. The number of participants in a relationship is called the degree and indicates the number of entities involved in a relationship. A relationship of degree one is called unary, which is commonly referred to as a recursive relationship. A unary relationship describes a relationship where the same entity participates more than once in different roles. An example of a unary relationship is Supervises, which represents an association of staff with a supervisor where the supervisor is also a member of staff. In other words, the Staff entity participates twice in the Supervises relationship; the first participation as a supervisor, and the second participation as a member of staff who is supervised (supervisee). See Figure 7.5 for a diagrammatic representation of the Supervises relationship.

A relationship of degree two is called binary. A relationship of a degree higher than binary is called a complex relationship. A relationship of degree three is called ternary. An example of a ternary relationship is Registers with three participating entities, namely Branch, Staff, and Member. The purpose of this relationship is to represent the situation where a member of staff registers a member at a particular branch, allowing for members to register at more than one branch, and members of staff to move between branches.


Figure 7.4 Example of a ternary relationship called Registers.

7.3 Describe what attributes represent in an ER model and provide examples of simple, composite, single-value, multi-value, and derived attributes.

An attribute is a property of an entity or a relationship. Attributes represent what we want to know about entities. For example, a Video entity may be described by the catalogNo, title, category, dailyRental, and price attributes. These attributes hold values that describe each video occurrence, and represent the main source of data stored in the database.

Simple attribute is an attribute composed of a single component. Simple attributes cannot be further subdivided. Examples of simple attributes include the category and price attributes for a video.

Composite attribute is an attribute composed of multiple components. Composite attributes can be further divided to yield smaller components with an independent existence. For example, the name attribute of the Member entity with the value 'Don Nelson' can be subdivided into fName ('Don') and lName ('Nelson').

Single-valued attribute is an attribute that holds a single value for an entity occurrence. The majority of attributes are single-valued for a particular entity. For example, each occurrence of the Video entity has a single value for the catalogNo attribute (for example, 207132), and therefore the catalogNo attribute is referred to as being single-valued.

Multi-valued attribute is an attribute that holds multiple values for an entity occurrence. Some attributes have multiple values for a particular entity. For example, each occurrence of the Video entity may have multiple values for the category attribute (for example, 'Children' and 'Comedy'), and therefore the category attribute in this case would be multi-valued. A multi-valued attribute may have a set of values with specified lower and upper limits. For example, the category attribute may have between one and three values.


Derived attribute is an attribute that represents a value that is derivable from the value of a related attribute, or set of attributes, not necessarily in the same entity. Some attributes may be related for a particular entity. For example, the age of a member of staff (age) is derivable from the date of birth (DOB) attribute, and therefore the age and DOB attributes are related. We refer to the age attribute as a derived attribute, the value of which is derived from the DOB attribute.

7.4 Describe what multiplicity represents for a relationship.

Multiplicity is the number of occurrences of one entity that may relate to a single occurrence of an associated entity.

7.5 What are business rules and how does multiplicity model these constraints?

Multiplicity constrains the number of entity occurrences that relate to other entity occurrences through a particular relationship. Multiplicity is a representation of the policies established by the user or company, and is referred to as a business rule. Ensuring that all appropriate business rules are identified and represented is an important part of modeling a company. The multiplicity for a binary relationship is generally referred to as one-to-one (1:1), one-to-many (1:*), or many-to-many (*:*). Examples of the three types of relationships include:

A member of staff manages a branch.
A branch has members of staff.
Actors play in videos.
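To make the three multiplicities concrete, here is one possible way (an assumption, not a mapping prescribed by the text) to enforce them in tables, using sqlite3. The table layouts, the separate Manages table, and the sample keys are all illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Branch (branchNo TEXT PRIMARY KEY);
    CREATE TABLE Staff (
        staffNo  TEXT PRIMARY KEY,
        -- 1:* Branch Has Staff: many Staff rows may share one branchNo.
        branchNo TEXT NOT NULL REFERENCES Branch(branchNo)
    );
    CREATE TABLE Manages (
        -- 1:1 Staff Manages Branch: each side may appear at most once.
        staffNo  TEXT NOT NULL UNIQUE REFERENCES Staff(staffNo),
        branchNo TEXT NOT NULL UNIQUE REFERENCES Branch(branchNo)
    );
    CREATE TABLE PlaysIn (
        -- *:* Actor PlaysIn Video: any pairing is allowed, but only once.
        -- Actor and Video tables are omitted to keep the sketch short.
        actorNo   TEXT NOT NULL,
        catalogNo TEXT NOT NULL,
        PRIMARY KEY (actorNo, catalogNo)
    );
    INSERT INTO Branch VALUES ('B1');
    INSERT INTO Staff VALUES ('S1', 'B1');
    INSERT INTO Staff VALUES ('S2', 'B1');
    INSERT INTO Manages VALUES ('S1', 'B1');
    INSERT INTO PlaysIn VALUES ('A1', 'V1');
    INSERT INTO PlaysIn VALUES ('A1', 'V2');
    INSERT INTO PlaysIn VALUES ('A2', 'V1');
""")

try:
    # A second manager for branch B1 breaks the 1:1 business rule.
    conn.execute("INSERT INTO Manages VALUES ('S2', 'B1')")
    second_manager_allowed = True
except sqlite3.IntegrityError:
    second_manager_allowed = False

staff_at_b1 = conn.execute(
    "SELECT COUNT(*) FROM Staff WHERE branchNo = 'B1'").fetchone()[0]
pairings = conn.execute("SELECT COUNT(*) FROM PlaysIn").fetchone()[0]
print(second_manager_allowed, staff_at_b1, pairings)
```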

7.6 How does multiplicity represent both the cardinality and the participation constraints on a relationship?

Multiplicity actually consists of two separate constraints known as cardinality and participation. Cardinality describes the number of possible relationships for each participating entity. Participation determines whether all or only some entity occurrences participate in a relationship. The cardinality of a binary relationship is what we have been referring to as one-to-one, one-to-many, and many-to-many. A participation constraint represents whether all entity occurrences are involved in a particular relationship (mandatory participation) or only some (optional participation). The cardinality and participation constraints for the Staff Manages Branch relationship are shown in Figure 7.11.

7.7 Provide an example of a relationship with attributes.

An example of a relationship with an attribute is the relationship called PlaysIn, which associates the Actor and Video entities. We may wish to record the character played by an actor in a given video. This information is associated with the PlaysIn relationship rather than the Actor or Video entities. We create an attribute called character to store this information and assign it to the PlaysIn relationship, as illustrated in Figure 7.12. Note, in this figure the character attribute is shown using the symbol for an entity; however, to distinguish between a relationship with an attribute and an entity, the rectangle representing the attribute is associated with the relationship using a dashed line.


Figure 7.12 A relationship called PlaysIn with an attribute called character.

7.8 Describe how strong and weak entities differ and provide an example of each.

We can classify entities as being either strong or weak. A strong entity is not dependent on the existence of another entity for its primary key. A weak entity is partially or wholly dependent on the existence of another entity, or entities, for its primary key. For example, as we can distinguish one actor from all other actors and one video from all other videos without the existence of any other entity, Actor and Video are referred to as being strong entities. In other words, the Actor and Video entities are strong because they have their own primary keys. An example of a weak entity is Role, which represents characters played by actors in videos. If we are unable to uniquely identify one Role entity occurrence from another without the existence of the Actor and Video entities, then Role is referred to as being a weak entity. In other words, the Role entity is weak because it has no primary key of its own.


Figure 7.6 Diagrammatic representation of attributes for the Video, Role, and Actor entities.

Strong entities are sometimes referred to as parent, owner, or dominant entities and weak entities as child, dependent, or subordinate entities.
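The dependence of a weak entity on its owners can be sketched in tables as follows. This is a hypothetical layout: the column names (including characterName) and the choice of a Role key built entirely from the Actor and Video keys are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Strong entities: each has a primary key of its own.
    CREATE TABLE Actor (actorNo TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE Video (catalogNo TEXT PRIMARY KEY, title TEXT NOT NULL);

    -- Weak entity: its key is borrowed from the owner entities.
    CREATE TABLE Role (
        actorNo       TEXT NOT NULL REFERENCES Actor(actorNo),
        catalogNo     TEXT NOT NULL REFERENCES Video(catalogNo),
        characterName TEXT NOT NULL,
        PRIMARY KEY (actorNo, catalogNo)
    );
    INSERT INTO Actor VALUES ('A1', 'Example Actor');
    INSERT INTO Video VALUES ('V1', 'Example Video');
    INSERT INTO Role  VALUES ('A1', 'V1', 'Example Character');
""")

try:
    # A Role cannot exist without an owning Actor occurrence.
    conn.execute("INSERT INTO Role VALUES ('A9', 'V1', 'Orphan')")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

# Columns of Role that take part in its primary key (pk position > 0).
role_key = [row[1] for row in conn.execute("PRAGMA table_info(Role)") if row[5] > 0]
print(orphan_allowed, role_key)
```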

7.9 Describe how fan and chasm traps can occur in an ER model and how they can be resolved.

Fan and chasm traps are two types of connection traps that can occur in ER models. The traps normally occur due to a misinterpretation of the meaning of certain relationships. In general, to identify connection traps we must ensure that the meaning of a relationship (and the business rule that it represents) is fully understood and clearly defined. If we don't understand the relationships we may create a model that is not a true representation of the 'real world'.

A fan trap may occur when two entities have 1:* relationships that fan out from a third entity, but the two entities should have a direct relationship between them to provide the necessary information. A fan trap may be resolved through the addition of a direct relationship between the two entities that were originally separated by the third entity.


A chasm trap may occur when an ER model suggests the existence of a relationship between entities, but the pathway does not exist between certain entity occurrences. More specifically, a chasm trap may occur where there is a relationship with optional participation that forms part of the pathway between the entities that are related. Again, a chasm trap may be resolved by the addition of a direct relationship between the two entities that were originally related through a pathway that included optional participation.


Chapter 8 Normalization - Review questions


8.1 Discuss how normalization may be used in database design.

Normalization can be used in database design in two ways: the first is to use normalization as a bottom-up approach to database design; the second is to use normalization in conjunction with ER modeling. Using normalization as a bottom-up approach involves analyzing the associations between attributes and, based on this analysis, grouping the attributes together to form tables that represent entities and relationships. However, this approach becomes difficult with a large number of attributes, where it's difficult to establish all the important associations between the attributes. Alternatively, you can use a top-down approach to database design. In this approach, we use ER modeling to create a data model that represents the main entities and relationships. We then translate the ER model into a set of tables that represents this data. It's at this point that we use normalization to check whether the tables are well designed.

8.2 Describe the types of update anomalies that may occur on a table that has redundant data.

Tables that have redundant data may have problems called update anomalies, which are classified as insertion, deletion, or modification anomalies. See Figure 8.2 for an example of a table with redundant data called StaffBranch. There are two main types of insertion anomalies, which we illustrate using this table.

Insertion anomalies

(1) To insert the details of a new member of staff located at a given branch into the StaffBranch table, we must also enter the correct details for that branch. For example, to insert the details of a new member of staff at branch B002, we must enter the correct details of branch B002 so that the branch details are consistent with values for branch B002 in other records of the StaffBranch table. The data shown in the StaffBranch table is also shown in the Staff and Branch tables shown in Figure 8.1. These tables do not have redundant data and do not suffer from this potential inconsistency, because for each staff member we only enter the appropriate branch number into the Staff table. In addition, the details of branch B002 are recorded only once in the database as a single record in the Branch table.

(2) To insert details of a new branch that currently has no members of staff into the StaffBranch table, it's necessary to enter nulls into the staff-related columns, such as staffNo. However, as staffNo is the primary key for the StaffBranch table, attempting to enter nulls for staffNo violates entity integrity, and is not allowed. The design of the tables shown in Figure 8.1 avoids this problem because new branch details are entered into the Branch table separately from the staff details. The details of staff ultimately located at a new branch can be entered into the Staff table at a later date.

Deletion anomalies

If we delete a record from the StaffBranch table that represents the last member of staff located at a branch, the details about that branch are also lost from the database. For example, if we delete the record for staff Art Peters (S0415) from the StaffBranch table, the details relating to branch B003 are lost from the database. The design of the tables in Figure 8.1 avoids this problem because branch records are stored separately from staff records and only the column branchNo relates the two tables. If we delete the record for staff Art Peters (S0415) from the Staff table, the details on branch B003 in the Branch table remain unaffected.

Modification anomalies

If we want to change the value of one of the columns of a particular branch in the StaffBranch table, for example the telephone number for branch B001, we must update the records of all staff located at that branch. If this modification is not carried out on all the appropriate records of the StaffBranch table, the database will become inconsistent. In this example, branch B001 would have different telephone numbers in different staff records.

The above examples illustrate that the Staff and Branch tables of Figure 8.1 have more desirable properties than the StaffBranch table of Figure 8.2. In the following sections, we examine how normal forms can be used to formalize the identification of tables that have desirable properties from those that may potentially suffer from update anomalies.
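The modification and deletion anomalies can be reproduced with a few rows of made-up data. In this sketch the tables are plain Python lists and dictionaries, and the staff numbers, branch numbers, and telephone numbers are placeholders rather than the values in the book's figures.

```python
# Redundant design: branch details are repeated in every staff record.
staff_branch = [
    {"staffNo": "S1", "branchNo": "B1", "telNo": "111"},
    {"staffNo": "S2", "branchNo": "B1", "telNo": "111"},
    {"staffNo": "S3", "branchNo": "B2", "telNo": "222"},
]

# Modification anomaly: only one of the two B1 records gets updated.
staff_branch[0]["telNo"] = "999"
numbers_for_b1 = {r["telNo"] for r in staff_branch if r["branchNo"] == "B1"}

# Deletion anomaly: removing the last member of staff at B2 loses the branch.
staff_branch = [r for r in staff_branch if r["staffNo"] != "S3"]
b2_known = any(r["branchNo"] == "B2" for r in staff_branch)

# Separate Staff and Branch tables: each branch fact is stored once.
staff = [
    {"staffNo": "S1", "branchNo": "B1"},
    {"staffNo": "S2", "branchNo": "B1"},
    {"staffNo": "S3", "branchNo": "B2"},
]
branch = {"B1": {"telNo": "111"}, "B2": {"telNo": "222"}}

branch["B1"]["telNo"] = "999"                        # one update, no inconsistency
staff = [r for r in staff if r["staffNo"] != "S3"]   # branch B2 is unaffected
b2_still_known = "B2" in branch

print(numbers_for_b1, b2_known, b2_still_known)
```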


8.3 Describe the characteristics of a table that violates first normal form (1NF) and then describe how such a table is converted to 1NF.

The rule for first normal form (1NF) is a table in which the intersection of every column and record contains only one value. In other words, a table that contains more than one atomic value in the intersection of one or more columns for one or more records is not in 1NF. The non-1NF table can be converted to 1NF by restructuring the original table: the column with the multiple values is removed, along with a copy of the primary key, to create a new table. See Figure 8.4 for an example of this approach. The advantage of this approach is that the resultant tables may be in normal forms higher than 1NF.

8.4 What is the minimal normal form that a relation must satisfy? Provide a definition for this normal form.

Only first normal form (1NF) is critical in creating appropriate tables for relational databases. All the subsequent normal forms are optional. However, to avoid the update anomalies discussed in Section 8.2, it's normally recommended that you proceed to third normal form (3NF). First normal form (1NF) is a table in which the intersection of every column and record contains only one value.

8.5 Describe an approach to converting a first normal form (1NF) table to second normal form (2NF) table(s).

Second normal form applies only to tables with composite primary keys, that is, tables with a primary key composed of two or more columns. A 1NF table with a single column primary key is automatically in at least 2NF. A second normal form (2NF) table is a table that is already in 1NF and in which the values in each non-primary-key column can be worked out from the values in all the columns that make up the primary key. A table in 1NF can be converted into 2NF by removing the columns that can be worked out from only part of the primary key. These columns are placed in a new table along with a copy of the part of the primary key that they can be worked out from.
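The 1NF-to-2NF conversion just described can be sketched on a small made-up table. The Allocation table, its composite key (staffNo, branchNo), and the sample values are assumptions. Here name can be worked out from staffNo alone, so it moves to a new table together with a copy of staffNo.

```python
def project(rows, columns):
    """Keep only the given columns and drop duplicate records."""
    result = []
    for row in rows:
        record = {column: row[column] for column in columns}
        if record not in result:
            result.append(record)
    return result

# Hypothetical 1NF table with composite primary key (staffNo, branchNo).
allocation = [
    {"staffNo": "S1", "branchNo": "B1", "name": "Ann", "hoursPerWeek": 16},
    {"staffNo": "S1", "branchNo": "B2", "name": "Ann", "hoursPerWeek": 9},
    {"staffNo": "S2", "branchNo": "B1", "name": "Bob", "hoursPerWeek": 14},
]

# name depends on only part of the key, so it goes to a new table
# along with a copy of that part of the key.
staff = project(allocation, ["staffNo", "name"])

# What remains depends on the whole key.
allocation_2nf = project(allocation, ["staffNo", "branchNo", "hoursPerWeek"])

print(staff)
print(allocation_2nf)
```

After the split, each staff name is stored once instead of once per allocation.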


8.6 Describe the characteristics of a table in second normal form (2NF).

Second normal form (2NF) is a table that is already in 1NF and in which the values in each non-primary-key column can only be worked out from the values in all the columns that make up the primary key.

8.7 Describe what is meant by full functional dependency and describe how this type of dependency relates to 2NF. Provide an example to illustrate your answer.

The formal definition of second normal form (2NF) is a table that is in first normal form and in which every non-primary-key column is fully functionally dependent on the primary key. Full functional dependency indicates that if A and B are columns of a table, B is fully functionally dependent on A if B is functionally dependent on A but not on any proper subset of A. If B is dependent on a subset of A, this is referred to as a partial dependency. If a partial dependency exists on the primary key, the table is not in 2NF. The partial dependency must be removed for a table to achieve 2NF. See Section 8.4 for an example.

8.8 Describe the characteristics of a table in third normal form (3NF).

Third normal form (3NF) is a table that is already in 1NF and 2NF, and in which the values in all non-primary-key columns can be worked out from only the primary key (or candidate key) column(s) and no other columns.

8.9 Describe what is meant by transitive dependency and describe how this type of dependency relates to 3NF. Provide an example to illustrate your answer.

The formal definition for third normal form (3NF) is a table that is in first and second normal forms and in which no non-primary-key column is transitively dependent on the primary key.

Transitive dependency is a type of functional dependency that occurs when a particular type of relationship holds between columns of a table. For example, consider a table with columns A, B, and C. If B is functionally dependent on A (A → B) and C is functionally dependent on B (B → C), then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C). If a transitive dependency exists on the primary key, the table is not in 3NF. The transitive dependency must be removed for a table to achieve 3NF. See Section 8.5 for an example.
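The A → B, B → C pattern can be checked against sample data with a small helper. This is only a sketch: a functional dependency is a rule about all possible data, so sample rows can refute a dependency but never prove it. The table, column names, and values below are made up, with staffNo, branchNo, and telNo playing the roles of A, B, and C.

```python
def determines(rows, lhs, rhs):
    """True if, in these rows, equal lhs values always come with equal rhs values."""
    seen = {}
    for row in rows:
        left = tuple(row[column] for column in lhs)
        right = tuple(row[column] for column in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True

rows = [
    {"staffNo": "S1", "branchNo": "B1", "telNo": "111"},
    {"staffNo": "S2", "branchNo": "B1", "telNo": "111"},
    {"staffNo": "S3", "branchNo": "B2", "telNo": "222"},
]

a_to_b = determines(rows, ["staffNo"], ["branchNo"])   # A -> B
b_to_c = determines(rows, ["branchNo"], ["telNo"])     # B -> C
b_to_a = determines(rows, ["branchNo"], ["staffNo"])   # B -> A must not hold
c_to_a = determines(rows, ["telNo"], ["staffNo"])      # C -> A must not hold

# telNo is transitively dependent on staffNo via branchNo.
transitive = a_to_b and b_to_c and not b_to_a and not c_to_a
print(transitive)
```

With staffNo as the primary key, this table would not be in 3NF; moving branchNo and telNo to a separate branch table removes the transitive dependency.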


Chapter 9 Logical Database Design - Step 1 - Review questions


9.1 Describe the purpose of a design methodology.

A design methodology is a structured approach that uses procedures, techniques, tools, and documentation aids to support and facilitate the process of design.

9.2 Describe the main phases involved in database design.

Database design is made up of two main phases: logical and physical database design.

Logical database design is the process of constructing a model of the data used in a company based on a specific data model, but independent of a particular DBMS and other physical considerations. In the logical database design phase we build the logical representation of the database, which includes identification of the important entities and relationships, and then translate this representation to a set of tables. The logical data model is a source of information for the physical design phase, providing the physical database designer with a vehicle for making tradeoffs that are very important to the design of an efficient database.

Physical database design is the process of producing a description of the implementation of the database on secondary storage; it describes the base tables, file organizations, and indexes used to achieve efficient access to the data, and any associated integrity constraints and security restrictions. In the physical database design phase we decide how the logical design is to be physically implemented in the target relational DBMS. This phase allows the designer to make decisions on how the database is to be implemented. Therefore, physical design is tailored to a specific DBMS.

9.3 Identify important factors in the success of database design.

The following are important factors to the success of database design:

Work interactively with the users as much as possible.
Follow a structured methodology throughout the data modeling process.
Employ a data-driven approach.
Incorporate structural and integrity considerations into the data models.
Use normalization and transaction validation techniques in the methodology.
Use diagrams to represent as much of the data models as possible.
Use a database design language (DBDL).
Build a data dictionary to supplement the data model diagrams.
Be willing to repeat steps.

9.4 Discuss the important role played by users in the process of database design.

Users play an essential role in confirming that the logical database design is meeting their requirements. Logical database design is made up of two steps and at the end of each step (Steps 1.9 and 2.5) users are required to review the design and provide feedback to the designer. Once the logical database design has been 'signed off' by the users the designer can continue to the physical database design stage.

9.5 Discuss the main activities associated with each step of the logical database design methodology.

The logical database design phase of the methodology is divided into two main steps. In Step 1 we create a data model and check that the data model has minimal redundancy and is capable of supporting user transactions. The output of this step is the creation of a logical data model, which is a complete and accurate representation of the company (or part of the company) that is to be supported by the database. In Step 2 we map the ER model to a set of tables. The structure of each table is checked using normalization. Normalization is an effective means of ensuring that the tables are structurally consistent and logical, with minimal redundancy. The tables are also checked to ensure that they are capable of supporting the required transactions. The required integrity constraints on the database are also defined.

9.6 Discuss the main activities associated with each step of the physical database design methodology.

Physical database design is divided into six main steps:

Step 3 involves the design of the base tables and integrity constraints using the available functionality of the target DBMS.

Step 4 involves choosing the file organizations and indexes for the base tables. Typically, DBMSs provide a number of alternative file organizations for data, with the exception of PC DBMSs, which tend to have a fixed storage structure.

Step 5 involves the design of the user views originally identified in the requirements analysis and collection stage of the database system development lifecycle.

Step 6 involves designing the security measures to protect the data from unauthorized access.

Step 7 considers relaxing the normalization constraints imposed on the tables to improve the overall performance of the system. This is a step that you should undertake only if necessary, because of the inherent problems involved in introducing redundancy while still maintaining consistency.

Step 8 is an ongoing process of monitoring and tuning the operational system to identify and resolve any performance problems resulting from the design and to implement new or changing requirements.

9.7 Discuss the purpose of Step 1 of logical database design.

The purpose of Step 1 is to build a logical data model of the data requirements of a company (or part of a company) to be supported by the database. Each logical data model comprises: entities, relationships, attributes and attribute domains, primary keys and alternate keys, and integrity constraints. The logical data model is supported by documentation, including a data dictionary and ER diagrams, which you'll produce throughout the development of the model.

9.8 Identify the main tasks associated with Step 1 of logical database design.

Step 1 Create and check ER model
Step 1.1 Identify entities
Step 1.2 Identify relationships
Step 1.3 Identify and associate attributes with entities or relationships
Step 1.4 Determine attribute domains
Step 1.5 Determine candidate, primary, and alternate key attributes
Step 1.6 Specialize/Generalize entities (optional step)
Step 1.7 Check model for redundancy
Step 1.8 Check model supports user transactions
Step 1.9 Review model with users

9.9 Discuss an approach to identifying entities and relationships from a users' requirements specification.

Identifying entities

One method of identifying entities is to examine the users' requirements specification. From this specification, you can identify nouns or noun phrases that are mentioned (for example, staff number, staff name, catalog number, title, daily rental rate, purchase price). You should also look for major objects such as people, places, or concepts of interest, excluding those nouns that are merely qualities of other objects. For example, you could group staff number and staff name with an entity called Staff and group catalog number, title, daily rental rate, and purchase price with an entity called Video.

An alternative way of identifying entities is to look for objects that have an existence in their own right. For example, Staff is an entity because staff exists whether or not you know their names, addresses, and salaries. If possible, you should get the user to assist with this activity.

Identifying relationships

Having identified the entities, the next step is to identify all the relationships that exist between these entities. When you identify entities, one method is to look for nouns in the users' requirements specification. Again, you can use the grammar of the requirements specification to identify relationships. Typically, relationships are indicated by verbs or verbal expressions. For example:
Branch Has Staff
Branch IsAllocated VideoForRent
VideoForRent IsPartOf RentalAgreement

The fact that the users' requirements specification records these relationships suggests that they are important to the users, and should be included in the model. Take great care to ensure that all the relationships that are either explicit or implicit in the users' requirements specification are noted. In principle, it should be possible to check each pair of entities for a potential relationship between them, but this would be a daunting task for a large system comprising hundreds of entities. On the other hand, it's unwise not to perform some such check. However, missing relationships should become apparent when you check the model supports the transactions that the users require. On the other hand, it is possible that an entity can have no relationship with other entities in the database but still play an important part in meeting the user's requirements.

9.10 Discuss an approach to identifying attributes from a users' requirements specification and the association of attributes with entities or relationships.

In a similar way to identifying entities, look for nouns or noun phrases in the users' requirements specification. The attributes can be identified where the noun or noun phrase is a property, quality, identifier, or characteristic of one of the entities or relationships that you've previously found.

By far the easiest thing to do when you've identified an entity or a relationship in the users' requirements specification is to consider "What information are we required to hold on ...?". The answer to this question should be described in the specification. However, in some cases, you may need to ask the users to clarify the requirements. Unfortunately, they may give you answers that also contain other concepts, so users' responses must be carefully considered.

9.11 Discuss an approach to checking a data model for redundancy. Give an example to illustrate your answer.

There are three approaches to identifying whether a data model suffers from redundancy:
(1) re-examining one-to-one (1:1) relationships;
(2) removing redundant relationships;
(3) considering the time dimension when assessing redundancy.

However, to answer this question you need only describe one approach. We describe approach (1) here.

An example of approach (1)

In the identification of entities, you may have identified two entities that represent the same object in the company. For example, you may have identified two entities named Branch and Outlet that are actually the same; in other words, Branch is a synonym for Outlet. In this case, the two entities should be merged together. If the primary keys are different, choose one of them to be the primary key and leave the other as an alternate key.

9.12 Describe two approaches to checking that a logical data model supports the transactions required by the user.

The two possible approaches to ensuring that the logical data model supports the required transactions are:

(1) Describing the transaction

Using the first approach, you check that all the information (entities, relationships, and their attributes) required by each transaction is provided by the model, by documenting a description of each transaction's requirements.

(2) Using transaction pathways

The second approach to validating the data model against the required transactions involves representing the pathway taken by each transaction directly on the ER diagram. Clearly, the more transactions that exist, the more complex this diagram would become, so for readability you may need several such diagrams to cover all the transactions.

9.13 Identify and describe the purpose of the documentation generated during Step 1 of logical database design.

Document entities

The data dictionary describes the entities including the entity name, description, aliases, and occurrences.


Figure 9. Extract from the data dictionary for the Branch user views of StayHome showing a description of entities.

ER diagrams

Throughout the database design phase, ER diagrams are used whenever necessary, to help build up a picture of what you're attempting to model. Different people use different notations for ER diagrams. In this book, we've used the latest object-oriented notation called UML (Unified Modeling Language), but other notations perform a similar function.

Document relationships

As you identify relationships, assign them names that are meaningful and obvious to the user, and also record relationship descriptions, and the multiplicity constraints in the data dictionary.


Figure 9.7 Extract from the data dictionary for the Branch user views of StayHome showing descriptions of relationships.

Document attributes

As you identify attributes, assign them names that are meaningful and obvious to the user. Where appropriate, record the following information for each attribute:
attribute name and description;
data type and length;
any aliases that the attribute is known by;
whether the attribute must always be specified (in other words, whether the attribute allows or disallows nulls);
whether the attribute is multi-valued;
whether the attribute is composite, and if so, which simple attributes make up the composite attribute;
whether the attribute is derived and, if so, how it should be computed;
default values for the attribute (if specified).


Figure 9.8 Extract from the data dictionary for the Branch user views of StayHome showing descriptions of attributes.

Document attribute domains

As you identify attribute domains, record their names and characteristics in the data dictionary. Update the data dictionary entries for attributes to record their domain in place of the data type and length information.

Document candidate, primary, and alternate keys

Record the identification of candidate, primary, and alternate keys (when available) in the data dictionary.


Figure 9.10 Extract from the data dictionary for the Branch user views of StayHome showing attributes with primary and alternate keys identified.

Document entities

You now have a logical data model that represents the database requirements of the company (or part of the company). The logical data model is checked to ensure that the model supports the required transactions. This process creates documentation that ensures that all the information (entities, relationships, and their attributes) required by each transaction is provided by the model, by documenting a description of each transaction's requirements. An alternative approach to validating the data model against the required transactions involves representing the pathway taken by each transaction directly on the ER diagram. Clearly, the more transactions that exist, the more complex this diagram would become, so for readability you may need several such diagrams to cover all the transactions.


Chapter 10 Logical Database Design - Step 2 - Review questions

10.1 Describe the main purpose and tasks of Step 2 of the logical database design methodology.

To create tables for the logical data model and to check the structure of the tables. The tasks involved in Step 2 are:
Step 2.1 Create tables
Step 2.2 Check table structures using normalization
Step 2.3 Check tables support user transactions
Step 2.4 Check business rules
Step 2.5 Review logical database design with users

10.2 Describe the rules for creating tables that represent:
(a) strong and weak entities;
(b) one-to-many (1:*) binary relationships;
(c) one-to-many (1:*) recursive relationships;
(d) one-to-one (1:1) binary relationships;
(e) one-to-one (1:1) recursive relationships;
(f) many-to-many (*:*) binary relationships;
(g) complex relationships;
(h) multi-valued attributes.
Give examples to illustrate your answers.


Examples are provided throughout the description of Step 2.1 in Chapter 10.

10.3 Discuss how the technique of normalization can be used to check the structure of the tables created from the ER model and supporting documentation.

The purpose of the technique of normalization is to examine the groupings of columns in each table created in Step 2.1. You check the composition of each table using the rules of normalization, to avoid unnecessary duplication of data. You should ensure that each table created is in at least third normal form (3NF). If you identify tables that are not in 3NF, this may indicate that part of the ER model is incorrect, or that you have introduced an error while creating the tables from the model. If necessary, you may need to restructure the data model and/or tables.

10.4 Discuss one approach that can be used to check that the tables support the transactions required by the users.

One approach to checking that the tables support a transaction is to examine the transaction's data requirements to ensure that the data is present in one or more tables. Also, if a transaction requires data in more than one table you should check that these tables are linked through the primary key/foreign key mechanism.

10.5 Discuss what business rules represent. Give examples to illustrate your answers.

Business rules are the constraints that you wish to impose in order to protect the database from becoming incomplete, inaccurate, or inconsistent. Although you may not be able to implement some business rules within the DBMS, this is not the question here. At this stage, you are concerned only with high-level design, that is, specifying what business rules are required irrespective of how this might be achieved. Having identified the business rules, you will have a logical data model that is a complete and accurate representation of the organization (or part of the organization) to be supported by the database. If necessary, you could produce a physical database design from the logical data model, for example, to prototype the system for the user. We consider the following types of business rules: required data, column domain constraints, entity integrity, multiplicity, referential integrity, and other business rules.

10.5 Describe the alternative strategies that can be applied if there is a child record referencing a parent record that we wish to delete.

If a record of the parent table is deleted, referential integrity is lost if there is a child record referencing the deleted parent record. In other words, referential integrity is lost if the deleted branch currently has one or more members of staff working at it. There are several strategies you can consider in this case:

NO ACTION Prevent a deletion from the parent table if there are any referencing child records. In our example, 'You cannot delete a branch if there are currently members of staff working there'.

CASCADE When the parent record is deleted, automatically delete any referencing child records. If any deleted child record also acts as a parent record in another relationship then the delete operation should be applied to the records in this child table, and so on in a cascading manner. In other words, deletions from the parent table cascade to the child table. In our example, 'Deleting a branch automatically deletes all members of staff working there'. Clearly, in this situation, this strategy would not be wise.

SET NULL When a parent record is deleted, the foreign key values in all related child records are automatically set to null. In our example, 'If a branch is deleted, indicate that the current branch for those members of staff previously working there is unknown'. You can only consider this strategy if the columns comprising the foreign key can accept nulls, as defined in Step 1.3.

SET DEFAULT When a parent record is deleted, the foreign key values in all related child records are automatically set to their default values. In our example, 'If a branch is deleted, indicate that the current assignment of members of staff previously working there is being assigned to another (default) branch'. You can only consider this strategy if the columns comprising the foreign key have default values, as defined in Step 1.3.

NO CHECK When a parent record is deleted, do nothing to ensure that referential integrity is maintained. This strategy should only be considered in extreme circumstances.

10.6 Discuss what business rules represent. Give examples to illustrate your answers.

Finally, you consider constraints known as business rules. Business rules should be represented as constraints on the database to ensure that only permitted updates to tables governed by 'real world' transactions are allowed. For example, StayHome has a business rule that prevents a member from renting more than 10 videos at any one time.
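
The delete strategies discussed above (NO ACTION, CASCADE, SET NULL, SET DEFAULT) correspond to the ON DELETE clause of an SQL foreign key. The following is a minimal sketch only, using Python's built-in sqlite3 module as a convenient test bed. The two-column Branch and Staff tables, the placeholder default branch 'B999', and the helper name staff_after_branch_delete are assumptions made for illustration and are not taken from the StayHome case study. NO CHECK has no clause of its own; it corresponds to declaring no foreign key at all.

```python
import sqlite3


def staff_after_branch_delete(action):
    """Delete branch B001 under the given ON DELETE action and report the effect on Staff."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys once this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
        CREATE TABLE Branch (branchNo TEXT PRIMARY KEY);
        CREATE TABLE Staff (
            staffNo  TEXT PRIMARY KEY,
            -- 'B999' is a hypothetical default branch used by SET DEFAULT.
            branchNo TEXT DEFAULT 'B999'
                     REFERENCES Branch (branchNo) ON DELETE {action}
        );
        INSERT INTO Branch VALUES ('B001'), ('B999');
        INSERT INTO Staff VALUES ('S1', 'B001');
    """)
    try:
        conn.execute("DELETE FROM Branch WHERE branchNo = 'B001'")
    except sqlite3.IntegrityError:
        # The DBMS refused the delete because a child record still references B001.
        return "rejected"
    return conn.execute("SELECT staffNo, branchNo FROM Staff").fetchall()


for action in ("NO ACTION", "CASCADE", "SET NULL", "SET DEFAULT"):
    print(action, "->", staff_after_branch_delete(action))
```

Running the loop shows the four behaviours side by side: the delete is rejected, the staff record disappears, its branchNo becomes null, or it is reassigned to the default branch.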


Chapter 11 Enhanced Entity-Relationship Modeling - Review questions

11.1 Describe what a superclass and a subclass represent.

Superclass is an entity that includes one or more distinct groupings of its occurrences, which require to be represented in a data model.

Subclass is a distinct grouping of occurrences of an entity, which require to be represented in a data model.

11.2 Describe the relationship between a superclass and its subclass.

The relationship between a superclass and any one of its subclasses is one-to-one (1:1) and is called a superclass/subclass relationship. For example, Staff/Manager forms a superclass/subclass relationship. Each member of a subclass is also a member of the superclass but has a distinct role.

11.3 Describe and illustrate using an example the process of attribute inheritance.

An entity occurrence in a subclass represents the same 'real world' object as in the superclass. Hence, a member of a subclass inherits those attributes associated with the superclass, but may also have subclass-specific attributes. For example, a member of the SalesPersonnel subclass has subclass-specific attributes, salesArea, vehLicenseNo, and carAllowance, and all the attributes of the Staff superclass, namely staffNo, name, position, salary, and branchNo.
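
One way to see attribute inheritance at work is to store the superclass and the subclass as two tables that share a primary key, so that a join returns the inherited and the subclass-specific attributes together. This is a sketch under stated assumptions: the one-table-per-class mapping is only one of several possible representations, the column types and sample rows are invented, and Python's sqlite3 module is used purely to make the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Superclass: attributes common to every member of staff.
    CREATE TABLE Staff (
        staffNo TEXT PRIMARY KEY, name TEXT, position TEXT,
        salary REAL, branchNo TEXT
    );
    -- Subclass: shares the superclass key and adds its own attributes.
    CREATE TABLE SalesPersonnel (
        staffNo TEXT PRIMARY KEY REFERENCES Staff (staffNo),
        salesArea TEXT, vehLicenseNo TEXT, carAllowance REAL
    );
    -- Invented sample data: S1 is sales personnel, S2 is not.
    INSERT INTO Staff VALUES ('S1', 'Ann', 'Sales', 25000, 'B001');
    INSERT INTO Staff VALUES ('S2', 'Bob', 'Manager', 30000, 'B001');
    INSERT INTO SalesPersonnel VALUES ('S1', 'North', 'ABC123', 3000);
""")

# Each SalesPersonnel occurrence 'inherits' the Staff attributes through the shared key.
cursor = conn.execute("""
    SELECT s.staffNo, s.name, s.position, s.salary, s.branchNo,
           p.salesArea, p.vehLicenseNo, p.carAllowance
    FROM SalesPersonnel AS p
    JOIN Staff AS s ON s.staffNo = p.staffNo
""")
rows = cursor.fetchall()
attribute_count = len(cursor.description)
print(rows)
```

Only S1 appears in the result, carrying five inherited attributes and three subclass-specific ones; S2 remains a plain member of the Staff superclass.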

11.4 What are the main reasons for introducing the concepts of superclasses and subclasses into an EER model?

There are two important reasons for introducing the concepts of superclasses and subclasses into an ER model. The first reason is that it avoids describing similar concepts more than once, thereby saving you time and making the ER model more readable. The second reason is that it adds more semantic information to the design in a form that is familiar to many people. For example, the assertions that 'Manager IS-A member of staff' and 'van IS-A type of vehicle' communicate significant semantic content in an easy-to-follow form.


11.5 Describe what a shared subclass represents.

A subclass is an entity in its own right and so it may also have one or more subclasses. A subclass with more than one superclass is called a shared subclass. In other words, a member of a shared subclass must be a member of the associated superclasses. As a consequence, the attributes of the superclasses are inherited by the shared subclass, which may also have its own additional attributes. This process is referred to as multiple inheritance.

11.6 Describe and contrast the process of specialization with the process of generalization.

Specialization is the process of maximizing the differences between members of an entity by identifying their distinguishing characteristics. Specialization is a top-down approach to defining a set of superclasses and their related subclasses. The set of subclasses is defined on the basis of some distinguishing characteristics of the entities in the superclass. When we identify a subclass of an entity, we then associate attributes specific to the subclass (where necessary), and also identify any relationships between the subclass and other entities or subclasses (where necessary).

Generalization is the process of minimizing the differences between entities by identifying their common features. The process of generalization is a bottom-up approach, which results in the identification of a generalized superclass from the original subclasses. The process of generalization can be viewed as the reverse of the specialization process.

11.7 Describe the two main constraints that apply to a specialization/generalization relationship.

There are two constraints that may apply to a superclass/subclass relationship called participation constraints and disjoint constraints.

Participation constraint determines whether every occurrence in the superclass must participate as a member of a subclass. A participation constraint may be mandatory or optional. A superclass/subclass relationship with a mandatory participation specifies that every entity occurrence in the superclass must also be a member of a subclass. A superclass/subclass relationship with optional participation specifies that a member of a superclass need not belong to any of its subclasses.

Disjoint constraint describes the relationship between members of the subclasses and indicates whether it's possible for a member of a superclass to be a member of one, or more than one, subclass. The disjoint constraint only applies when a superclass has more than one subclass. If the subclasses are disjoint, then an entity occurrence can be a member of only one of the subclasses. To represent a disjoint superclass/subclass relationship, an 'Or' is placed next to the participation constraint within the curly brackets. If subclasses of a specialization/generalization are not disjoint (called nondisjoint), then an entity occurrence may be a member of more than one subclass.

The participation and disjoint constraints of specialization/generalization are distinct, giving the following four categories: mandatory and nondisjoint, optional and nondisjoint, mandatory and disjoint, and optional and disjoint.
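
The four categories can be made concrete with a small check that reports which entity occurrences would break a given combination of constraints. This is an illustrative sketch in plain Python: the function name check_constraints, the set-based representation, and the sample staff numbers are all assumptions introduced here, not part of the methodology.

```python
def check_constraints(superclass, subclasses, mandatory, disjoint):
    """Return (occurrence, reason) pairs that break the stated constraints.

    superclass -- set of occurrence identifiers in the superclass
    subclasses -- dict mapping subclass name to the set of its members
    mandatory  -- True for mandatory participation, False for optional
    disjoint   -- True for disjoint subclasses, False for nondisjoint
    """
    violations = []
    for member in sorted(superclass):
        memberships = sorted(
            name for name, members in subclasses.items() if member in members
        )
        # Mandatory participation: every occurrence must be in some subclass.
        if mandatory and not memberships:
            violations.append((member, "belongs to no subclass"))
        # Disjoint: an occurrence may be in at most one subclass.
        if disjoint and len(memberships) > 1:
            violations.append((member, "belongs to " + " and ".join(memberships)))
    return violations


# Invented sample data: S1 is in two subclasses, S3 is in none.
staff = {"S1", "S2", "S3"}
staff_subclasses = {"Manager": {"S1"}, "SalesPersonnel": {"S1", "S2"}}

print(check_constraints(staff, staff_subclasses, mandatory=True, disjoint=True))
print(check_constraints(staff, staff_subclasses, mandatory=False, disjoint=False))
```

The same data is acceptable under {Optional, And} (optional and nondisjoint) but breaks both rules under {Mandatory, Or} (mandatory and disjoint).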


Chapter 12 Physical Database Design - Step 3 - Review questions

12.1 Explain the difference between logical and physical database design. Why might these tasks be carried out by different people?

Logical database design is independent of implementation details, such as the specific functionality of the target DBMS, application programs, programming languages, or any other physical considerations. The output of this process is a logical data model that includes a set of relational tables together with supporting documentation, such as a data dictionary. These represent the sources of information for the physical design process, and they provide you with a vehicle for making trade-offs that are so important to an efficient database design.

Whereas logical database design is concerned with the what, physical database design is concerned with the how. In particular, the physical database designer must know how the computer system hosting the DBMS operates, and must also be fully aware of the functionality of the target DBMS. As the functionality provided by current systems varies widely, physical design must be tailored to a specific DBMS. However, physical database design is not an isolated activity; there is often feedback between physical, logical, and application design. For example, decisions taken during physical design to improve performance, such as merging tables together, might affect the logical data model.

12.2 Describe the inputs and outputs of physical database design.

The inputs are the logical data model and the data dictionary. The outputs are the base tables, integrity rules, file organization specified, secondary indexes determined, user views and security mechanisms.

12.3 Describe the purpose of the main steps in the physical design methodology presented in this chapter.

Step 3 produces a relational database schema from the logical data model, which defines the base tables, integrity rules, and how to represent derived data.


12.4 Describe the types of information required to design the base tables.

You will need to know:
how to create base tables;
whether the system supports the definition of primary keys, foreign keys, and alternate keys;
whether the system supports the definition of required data (that is, whether the system allows columns to be defined as NOT NULL);
whether the system supports the definition of domains;
whether the system supports relational integrity rules;
whether the system supports the definition of business rules.

12.5 Describe how you would handle the representation of derived data in the database. Give an example to illustrate your answer.

From a physical database design perspective, whether a derived column is stored in the database or calculated every time it's needed is a trade-off. To decide, you should calculate:
the additional cost to store the derived data and keep it consistent with the data from which it is derived, and
the cost to calculate it each time it's required,
and choose the less expensive option subject to performance constraints.
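
The two options in this trade-off can be placed side by side in a small experiment. The sketch below is an assumption-laden illustration rather than the book's own example: the derived column noOfStaff, the trigger staff_added, the view BranchStaffCount, and the sample rows are all invented here, and Python's sqlite3 module is used only to make the comparison runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Branch (
        branchNo  TEXT PRIMARY KEY,
        noOfStaff INTEGER NOT NULL DEFAULT 0   -- stored derived column
    );
    CREATE TABLE Staff (
        staffNo  TEXT PRIMARY KEY,
        branchNo TEXT REFERENCES Branch (branchNo)
    );

    -- Option 1: store the derived value and pay to keep it consistent.
    -- A complete design would also need triggers for DELETE and UPDATE.
    CREATE TRIGGER staff_added AFTER INSERT ON Staff
    BEGIN
        UPDATE Branch SET noOfStaff = noOfStaff + 1
        WHERE branchNo = NEW.branchNo;
    END;

    -- Option 2: store nothing extra and calculate the value on demand.
    CREATE VIEW BranchStaffCount AS
        SELECT b.branchNo, COUNT(s.staffNo) AS noOfStaff
        FROM Branch AS b
        LEFT JOIN Staff AS s ON s.branchNo = b.branchNo
        GROUP BY b.branchNo;

    INSERT INTO Branch (branchNo) VALUES ('B001'), ('B002');
    INSERT INTO Staff VALUES ('S1', 'B001'), ('S2', 'B001');
""")

stored = conn.execute(
    "SELECT branchNo, noOfStaff FROM Branch ORDER BY branchNo").fetchall()
computed = conn.execute(
    "SELECT branchNo, noOfStaff FROM BranchStaffCount ORDER BY branchNo").fetchall()
print(stored, computed)
```

Both routes give the same counts; the stored column makes reads cheap but adds work to every insert, while the view keeps updates cheap and moves the cost to each query.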


Chapter 13 Physical Database Design - Step 4 - Review questions

13.1 Describe the purpose of Step 4 in the database design methodology.

Step 4 determines the file organizations for the base tables. This takes account of the nature of the transactions to be carried out, which also determine where secondary indexes will be of use.

13.2 Discuss the purpose of analyzing the transactions that have to be supported and describe the type of information you would collect and analyze.

You can't make meaningful physical design decisions until you understand in detail the transactions that have to be supported. In analyzing the transactions, you're attempting to identify performance criteria, such as:
the transactions that run frequently and will have a significant impact on performance;
the transactions that are critical to the operation of the business;
the times of the day/week when there will be a high demand made on the database (called the peak load).

You'll use this information to identify the parts of the database that may cause performance problems. At the same time, you need to identify the high-level functionality of the transactions, such as the columns that are updated in an update transaction or the columns that are retrieved in a query. You'll use this information to select appropriate file organizations and indexes.

13.3 When would you not add any indexes to a table?

(1) Do not index small tables. It may be more efficient to search the table in memory than to store an additional index structure.
(2) Avoid indexing a column or table that is frequently updated.
(3) Avoid indexing a column if the query will retrieve a significant proportion (for example, 25%) of the records in the table, even if the table is large. In this case, it may be more efficient to search the entire table than to search using an index.
(4) Avoid indexing columns that consist of long character strings.



13.4 Discuss some of the main reasons for selecting a column as a potential candidate for indexing. Give examples to illustrate your answer.

(1) In general, index the primary key of a table if it's not a key of the file organization. Although the SQL standard provides a clause for the specification of primary keys as discussed in Step 3.1 covered in the last chapter, note that this does not guarantee that the primary key will be indexed in some RDBMSs.
(2) Add a secondary index to any column that is heavily used for data retrieval. For example, add a secondary index to the Member table based on the column lName, as discussed above.
(3) Add a secondary index to a foreign key if there is frequent access based on it. For example, you may frequently join the VideoForRent and Branch tables on the column branchNo (the branch number). Therefore, it may be more efficient to add a secondary index to the VideoForRent table based on branchNo.
(4) Add a secondary index on columns that are frequently involved in: (a) selection or join criteria; (b) ORDER BY; (c) GROUP BY; (d) other operations involving sorting (such as UNION or DISTINCT).
(5) Add a secondary index on columns involved in built-in functions, along with any columns used to aggregate the built-in functions. For example, to find the average staff salary at each branch, you could use the following SQL query:
SELECT branchNo, AVG(salary) FROM Staff GROUP BY branchNo;

From the previous guideline, you could consider adding an index to the branchNo column by virtue of the GROUP BY clause. However, it may be more efficient to consider an index on both the branchNo column and the salary column. This may allow the DBMS to perform the entire query from data in the index alone, without having to access the data file. This is sometimes called an index-only plan, as the required response can be produced using only data in the index.
(6) As a more general case of the previous guideline, add a secondary index on columns that could result in an index-only plan.
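
Whether a particular DBMS actually chooses an index-only plan can be checked by asking it for the query plan. The sketch below does this with SQLite through Python's sqlite3 module, where such an index is reported as a covering index. The Staff columns other than branchNo and salary, the sample rows, and the index name idx_staff_branch_salary are assumptions for illustration; other systems expose their plans through different commands and wording.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (
        staffNo  TEXT PRIMARY KEY,
        name     TEXT,
        position TEXT,
        salary   REAL,
        branchNo TEXT
    );
    -- Invented sample rows.
    INSERT INTO Staff VALUES
        ('S1', 'Ann',  'Manager',   30000, 'B001'),
        ('S2', 'Bob',  'Assistant', 20000, 'B001'),
        ('S3', 'Cara', 'Assistant', 24000, 'B002');
    -- Composite index holding every column the query below needs.
    CREATE INDEX idx_staff_branch_salary ON Staff (branchNo, salary);
""")

query = "SELECT branchNo, AVG(salary) FROM Staff GROUP BY branchNo"

# Each EXPLAIN QUERY PLAN row is (id, parent, notused, detail); keep the detail text.
plan = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
averages = dict(conn.execute(query).fetchall())

print(plan)
print(averages)
```

If the plan text mentions a covering index, the query was answered from the index alone; dropping salary from the index definition and rerunning is a quick way to see the plan change.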



13.5 Having identified a column as a potential candidate, under what circumstances would you decide against indexing it?

Having drawn up your 'wish-list' of potential indexes, consider the impact of each of these on update transactions. If the maintenance of the index is likely to slow down important update transactions, then consider dropping the index from the list.


Chapter 14 Physical Database Design - Steps 5 and 6 - Review questions

14.1 Describe the purpose of the main steps in the physical design methodology presented in this chapter.

Step 5 designs the user views for the database implementation. Step 6 designs the security mechanisms for the database implementation. This includes designing the access rules on the base relations.

14.2 Discuss the difference between system security and data security.

System security covers access and use of the database at the system level, such as a username and password. Data security covers access and use of database objects (such as tables and views) and the actions that users can have on the objects.

14.3 Describe the access control facilities of SQL.

Each database user is assigned an authorization identifier by the Database Administrator (DBA); usually, the identifier has an associated password, for obvious security reasons. Every SQL statement that is executed by the DBMS is performed on behalf of a specific user. The authorization identifier is used to determine which database objects that user may reference, and what operations may be performed on those objects. Each object that is created in SQL has an owner, who is identified by the authorization identifier. By default, the owner is the only person who may know of the existence of the object and perform any operations on the object.

Privileges are the actions that a user is permitted to carry out on a given base table or view. For example, SELECT is the privilege to retrieve data from a table and UPDATE is the privilege to modify records of a table. When a user creates a table using the SQL CREATE TABLE statement, he or she automatically becomes the owner of the table and receives full privileges for the table. Other users initially have no privileges on the newly created table. To give them access to the table, the owner must explicitly grant them the necessary privileges using the SQL GRANT statement. A WITH GRANT OPTION clause can be specified with the GRANT statement to allow the receiving user(s) to pass the privilege(s) on to other users. Privileges can be revoked using the SQL REVOKE statement.



?hen a user creates a vie! !ith the CREATE VIEW statement& he or she automatically becomes the o!ner of the vie!& but does not necessarily receive full privileges on the vie!$ .o create the vie!& a user must have SELECT privilege to all the tables that ma e up the vie!$ 3o!ever& the o!ner !ill only get other privileges if he or she holds those privileges for every table in the vie!$ 1'.& !escribe the security features of )icrosoft ?ccess BB .

Access provides a number of security features including the following two methods: (a) setting a password for opening a database (system security); (b) user-level security, which can be used to limit the parts of the database that a user can read or update (data security). In addition to the above two methods of securing a Microsoft Access database, other security features include:

• Encryption/decryption: encrypting a database compacts a database file and makes it indecipherable by a utility program or word processor. This is useful if you wish to transmit a database electronically or when you store it on a floppy disk or compact disc. Decrypting a database reverses the encryption.

• Preventing users from replicating a database, setting passwords, or setting startup options.

• Securing VBA code: this can be achieved by setting a password that you enter once per session or by saving the database as an MDE file, which compiles the VBA source code before removing it from the database. Saving the database as an MDE file also prevents users from modifying forms and reports without requiring them to specify a log on password or without you having to set up user-level security.


Chapter 15 Physical Database Design - Step 7 - Review questions


15.1 Describe the purpose of Step 7 in the database design methodology.

Step 7 considers relaxing the normalization constraints imposed on the logical data model to improve the overall performance of the system.

15.2 Explain the meaning of denormalization.

Formally, the term denormalization refers to a change to the structure of a base table, such that the new table is in a lower normal form than the original table. However, we also use the term more loosely to refer to situations where we combine two tables into one new table, where the new table is in the same normal form but contains more nulls than the original tables.

15.3 Discuss when it may be appropriate to denormalize a table. Give examples to illustrate your answer.

There are no fixed rules for determining when to denormalize tables. Some of the more common situations for considering denormalization to speed up frequent or critical transactions are:

Step 7.2.1 Combining one-to-one (1:1) relationships
Step 7.2.2 Duplicating nonkey columns in one-to-many (1:*) relationships to reduce joins
Step 7.2.3 Duplicating foreign key columns in one-to-many (1:*) relationships to reduce joins
Step 7.2.4 Duplicating columns in many-to-many (*:*) relationships to reduce joins
Step 7.2.5 Introducing repeating groups
Step 7.2.6 Creating extract tables
Step 7.2.7 Partitioning tables

15.4 Describe the two main approaches to partitioning and discuss when each may be an appropriate way to improve performance. Give examples to illustrate your answer.

Horizontal partitioning: Distributing the records of a table across a number of (smaller) tables.

Vertical partitioning: Distributing the columns of a table across a number of (smaller) tables (the primary key is duplicated to allow the original table to be reconstructed).

Partitions are particularly useful in applications that store and analyze large amounts of data. For example, let's suppose there are hundreds of thousands of records in the VideoForRent table that are held indefinitely for analysis purposes. Searching for a particular record at a branch could be quite time consuming; however, we could reduce this time by horizontally partitioning the table, with one partition for each branch. There may also be circumstances where we frequently examine particular columns of a very large table and it may be appropriate to vertically partition the table into those columns that are frequently accessed together and another vertical partition for the remaining columns (with the primary key replicated in each partition to allow the original table to be reconstructed).
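Both kinds of partitioning can be tried out on a toy version of the VideoForRent table. The following is a minimal sketch using Python's built-in sqlite3 module; the column names (videoNo, branchNo, available, notes), the sample rows, and the partition table names are assumptions made for illustration and do not come from the case study.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VideoForRent (
    videoNo   TEXT PRIMARY KEY,
    branchNo  TEXT NOT NULL,
    available INTEGER NOT NULL,
    notes     TEXT
);
INSERT INTO VideoForRent VALUES
    ('V001', 'B001', 1, 'new sleeve'),
    ('V002', 'B001', 0, NULL),
    ('V003', 'B002', 1, 'scratched');
""")

# Horizontal partitioning: one smaller table per branch, same columns.
branches = [row[0] for row in conn.execute(
    "SELECT DISTINCT branchNo FROM VideoForRent ORDER BY branchNo")]
for branch in branches:
    part = f"VideoForRent_{branch}"  # table names cannot be bound as parameters
    conn.execute(
        f'CREATE TABLE "{part}" (videoNo TEXT PRIMARY KEY, branchNo TEXT NOT NULL, '
        "available INTEGER NOT NULL, notes TEXT)")
    conn.execute(
        f'INSERT INTO "{part}" SELECT * FROM VideoForRent WHERE branchNo = ?',
        (branch,))

# Vertical partitioning: frequently read columns in one table, the rest in
# another, with the primary key repeated in both partitions.
conn.executescript("""
CREATE TABLE VideoHot  AS SELECT videoNo, branchNo, available FROM VideoForRent;
CREATE TABLE VideoCold AS SELECT videoNo, notes FROM VideoForRent;
""")

# The original table can be reconstructed by joining on the primary key.
rebuilt = conn.execute("""
    SELECT h.videoNo, h.branchNo, h.available, c.notes
    FROM VideoHot h JOIN VideoCold c ON c.videoNo = h.videoNo
    ORDER BY h.videoNo""").fetchall()
original = conn.execute(
    "SELECT * FROM VideoForRent ORDER BY videoNo").fetchall()
```

A search restricted to one branch now touches only that branch's partition, and the join on videoNo shows why the primary key has to be kept in every vertical partition.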


Chapter 16 Physical Database Design - Step 8 - Review questions


16.1 Describe the purpose of the main steps in the physical design methodology presented in this chapter.

Step 8 monitors the database application systems and improves performance by making amendments to the design as appropriate.

16.2 What factors can be used to measure efficiency?

There are a number of factors that we may use to measure efficiency:

• Transaction throughput: this is the number of transactions processed in a given time interval. In some systems, such as airline reservations, high transaction throughput is critical to the overall success of the system.

• Response time: this is the elapsed time for the completion of a single transaction. From a user's point of view, you want to minimize response time as much as possible. However, there are some factors that influence response time that you may have no control over, such as system loading or communication times. You can shorten response time by:
  - reducing contention and wait times, particularly disk I/O wait times;
  - reducing the amount of time resources are required;
  - using faster components.

• Disk storage: this is the amount of disk space required to store the database files. You may wish to minimize the amount of disk storage used.
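The first two measures can be computed directly from a log of transaction start and end times. The sketch below assumes a hypothetical log of (start, end) pairs in seconds; the function names and the sample figures are illustrative only.

```python
def throughput(completion_times, window_start, window_end):
    """Transactions completed per second within [window_start, window_end)."""
    completed = sum(window_start <= t < window_end for t in completion_times)
    return completed / (window_end - window_start)


def mean_response_time(transactions):
    """Average elapsed time from submission to completion, in seconds."""
    return sum(end - start for start, end in transactions) / len(transactions)


# Hypothetical log: (submitted_at, completed_at) for four transactions.
log = [(0.0, 0.5), (0.25, 1.0), (1.0, 1.25), (1.5, 3.5)]

tps = throughput([end for _, end in log], window_start=0.0, window_end=2.0)
avg = mean_response_time(log)
```

Here three transactions finish inside the two-second window, so the throughput is 1.5 transactions per second, while the mean response time is 0.875 seconds. The two measures can move independently: the long-running fourth transaction raises the mean response time without affecting throughput in that window.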

16.3 Discuss how the four basic hardware components interact and affect system performance.

• main memory
• CPU
• disk I/O
• network.



Each of these resources may affect other system resources. Equally well, an improvement in one resource may effect an improvement in other system resources. For example:

• Adding more main memory should result in less paging. This should help avoid CPU bottlenecks.
• More effective use of main memory may result in less disk I/O.

16.4 How should you distribute data across disks?

Figure 16.1 illustrates the basic principles of distributing the data across disks:

• The operating system files should be separated from the database files.
• The main database files should be separated from the index files.
• The recovery log file, if available and if used, should be separated from the rest of the database.

Figure 16.1 Typical disk configuration.

16.5 What is RAID technology and how does it improve performance and reliability?

RAID originally stood for Redundant Array of Inexpensive Disks, but more recently the 'I' in RAID has come to stand for Independent. RAID works on having a large disk array comprising an arrangement of several independent disks that are organized to increase performance and at the same time improve reliability.

Performance is increased through data striping: the data is segmented into equal-size partitions (the striping unit), which are transparently distributed across multiple disks. This gives the appearance of a single large, very fast disk where in actual fact the data is distributed across several smaller disks. Striping improves overall I/O performance by allowing multiple I/Os to be serviced in parallel. At the same time, data striping also balances the load among disks.

Reliability is improved through storing redundant information across the disks using a parity scheme or an error-correcting scheme. In the event of a disk failure, the redundant information can be used to reconstruct the contents of the failed disk.
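Striping and parity-based reconstruction can be illustrated in a few lines. The sketch below assumes a simple scheme with three data disks and one dedicated XOR parity block; the function names, the two-byte striping unit, and the sample data are illustrative choices rather than a description of any particular RAID level or product.

```python
from functools import reduce


def stripe(data: bytes, n_disks: int, unit: int):
    """Split data into equal-size striping units dealt round-robin to the disks."""
    padded = data + b"\x00" * (-len(data) % (unit * n_disks))
    units = [padded[i:i + unit] for i in range(0, len(padded), unit)]
    return [b"".join(units[d::n_disks]) for d in range(n_disks)]


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


disks = stripe(b"ABCDEFGHIJKL", n_disks=3, unit=2)
parity = xor_blocks(disks)  # redundant information kept on a separate disk

# Simulate the failure of disk 1 and rebuild it from the survivors plus parity.
rebuilt = xor_blocks([disks[0], disks[2], parity])
```

Because XOR is its own inverse, combining the surviving disks with the parity block yields exactly the bytes that were on the failed disk.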


Chapter 19 Current and Emerging Trends - Review questions


19.1 Discuss the general characteristics of advanced database applications.

• Design data is characterized by a large number of types, each with a small number of instances. Conventional databases are typically the opposite.
• Designs may be very large, perhaps consisting of millions of parts, often with many interdependent subsystem designs.
• The design is not static but evolves through time. When a design change occurs, its implications must be propagated through all design representations. The dynamic nature of design may mean that some actions cannot be foreseen at the beginning.
• Updates are far-reaching because of topological or functional relationships, tolerances, and so on. One change is likely to affect a large number of design objects.
• Often, many design alternatives are being considered for each component, and the correct version for each part must be maintained. This involves some form of version control and configuration management.
• There may be hundreds of staff involved with the design, and they may work in parallel on multiple versions of a large design. Even so, the end product must be consistent and coordinated. This is sometimes referred to as cooperative engineering.

19.2 Discuss why the weaknesses of the relational data model and relational DBMSs may make them unsuitable for advanced database applications.

Poor representation of 'real world' entities

Normalization generally leads to the creation of tables that do not correspond to entities in the 'real world'. The fragmentation of a 'real world' entity into many tables, with a physical representation that reflects this structure, is inefficient, leading to many joins during query processing.

Semantic overloading

The relational model has only one construct for representing data and relationships between data, namely the table. For example, to represent a many-to-many (*:*) relationship between two entities A and B, we create three tables, one to represent each of the entities A and B, and one to represent the relationship. There is no mechanism to distinguish between entities and relationships, or to distinguish between different kinds of relationship that exist between entities. For example, a 1:* relationship might be Has, Supervises, Manages, and so on. If such distinctions could be made, then it might be possible to build the semantics into the operations. It is said that the relational model is semantically overloaded.
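The three-table representation of a *:* relationship can be seen in a small sketch using Python's built-in sqlite3 module. The table and column names (A, B, AB, aNo, bNo) are assumptions for illustration, and the catalog query is specific to SQLite; the point is only that the system records the two entities and the relationship in exactly the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (aNo INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE B (bNo INTEGER PRIMARY KEY, name TEXT);
-- The *:* relationship between A and B needs a table of its own.
CREATE TABLE AB (
    aNo INTEGER REFERENCES A(aNo),
    bNo INTEGER REFERENCES B(bNo),
    PRIMARY KEY (aNo, bNo)
);
""")

# SQLite's catalog describes all three objects simply as tables: nothing marks
# AB as a relationship, or says what kind of relationship it represents.
kinds = [row[0] for row in conn.execute(
    "SELECT type FROM sqlite_master WHERE name IN ('A', 'B', 'AB') ORDER BY name")]
```

Any knowledge that AB represents, say, a Supervises or Manages relationship has to live in documentation or application code rather than in the model itself.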

Poor support for business rules

In Section 2.3, we introduced the concepts of entity and referential integrity, and in Section 2.2.1 we introduced domains, which are also types of business rules. Unfortunately, many commercial systems do not fully support these rules, and it's necessary to build them into the applications. This, of course, is dangerous and can lead to duplication of effort and, worse still, inconsistencies. Furthermore, there is no support for other types of business rules in the relational model, which again means they have to be built into the DBMS or the application.

Limited operations

The relational model has only a fixed set of operations, such as set and record-oriented operations, operations that are provided in the SQL specification. However, SQL currently does not allow new operations to be specified. Again, this is too restrictive to model the behavior of many 'real world' objects. For example, a GIS application typically uses points, lines, line groups, and polygons, and needs operations for distance, intersection, and containment.

Difficulty handling recursive queries

Atomicity of data means that repeating groups are not allowed in the relational model. As a result, it's extremely difficult to handle recursive queries: that is, queries about relationships that a table has with itself (directly or indirectly). To overcome this problem, SQL can be embedded in a high-level programming language, which provides constructs to facilitate iteration. Additionally, many RDBMSs provide a report writer with similar constructs. In either case, it is the application rather than the inherent capabilities of the system that provides the required functionality.
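The workaround described above, where the host language supplies the iteration, might look like the following sketch using Python's built-in sqlite3 module. The Staff table, its columns, the sample rows, and the all_reports function are assumptions made for illustration; the loop also assumes the supervision hierarchy contains no cycles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staff (
    staffNo      TEXT PRIMARY KEY,
    supervisorNo TEXT REFERENCES Staff(staffNo)  -- relationship of Staff with itself
);
INSERT INTO Staff VALUES
    ('S1', NULL), ('S2', 'S1'), ('S3', 'S1'), ('S4', 'S2'), ('S5', 'S4');
""")


def all_reports(conn, boss):
    """Everyone supervised by boss, directly or indirectly (assumes no cycles)."""
    found, frontier = [], [boss]
    while frontier:
        # Each pass issues one plain, non-recursive query for the next level down.
        marks = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT staffNo FROM Staff WHERE supervisorNo IN ({marks}) "
            "ORDER BY staffNo", frontier).fetchall()
        frontier = [row[0] for row in rows]
        found.extend(frontier)
    return found


reports = all_reports(conn, "S1")
```

Each SQL statement only looks one level down the hierarchy; it is the while loop in the application that provides the recursion.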


Impedance mismatch

In Section 3.1.1, we noted that until the most recent version of the standard, SQL lacked computational completeness. To overcome this problem and to provide additional flexibility, the SQL standard provides embedded SQL to help develop more complex database applications. However, this approach produces an impedance mismatch because we are mixing different programming paradigms:

(1) SQL is a declarative language that handles rows of data, whereas a high-level language such as 'C' is a procedural language that can handle only one row of data at a time.
(2) SQL and 3GLs use different models to represent data. For example, SQL provides the built-in data types Date and Interval, which are not available in traditional programming languages. Thus, it's necessary for the application program to convert between the two representations, which is inefficient, both in programming effort and in the use of runtime resources. Furthermore, since we are using two different type systems, it's not possible to automatically type check the application as a whole.

The latest release of the SQL standard, SQL3, addresses some of the above deficiencies with the introduction of many new features, such as the ability to define new data types and operations as part of the data definition language, and the addition of new constructs to make the language computationally complete.

19.3 Explain what is meant by a DDBMS, and discuss the motivation in providing such a system.

A Distributed Database Management System (DDBMS) consists of a single logical database that is split into a number of fragments. Each fragment is stored on one or more computers (replicas) under the control of a separate DBMS, with the computers connected by a communications network. Each site is capable of independently processing user requests that require access to local data (that is, each site has some degree of local autonomy) and is also capable of processing data stored on other computers in the network.

19.4 Compare and contrast a DDBMS with distributed processing. Under what circumstances would you choose a DDBMS over distributed processing?

Distributed processing: a centralized database that can be accessed over a computer network.



The key point with the definition of a distributed DBMS is that the system consists of data that is physically distributed across a number of sites in the network. If the data is centralized, even though other users may be accessing the data over the network, we do not consider this to be a distributed DBMS, simply distributed processing.

19.5 Discuss the advantages and disadvantages of a DDBMS.

Advantages

Reflects organizational structure: Many organizations are naturally distributed over several locations. It's natural for databases used in such an application to be distributed over these locations.

Improved shareability and local autonomy: The geographical distribution of an organization can be reflected in the distribution of the data; users at one site can access data stored at other sites. Data can be placed at the site close to the users who normally use that data. In this way, users have local control of the data, and they can consequently establish and enforce local policies regarding the use of this data.

Improved availability: In a centralized DBMS, a computer failure terminates the operations of the DBMS. However, a failure at one site of a DDBMS, or a failure of a communication link making some sites inaccessible, does not make the entire system inoperable.

Improved reliability: As data may be replicated so that it exists at more than one site, the failure of a node or a communication link does not necessarily make the data inaccessible.

Improved performance: As the data is located near the site of 'greatest demand', and given the inherent parallelism of DDBMSs, database access may be faster than with a remote centralized database. Furthermore, since each site handles only a part of the entire database, there may not be the same contention for CPU and I/O services as characterized by a centralized DBMS.

Economics: It's generally accepted that it costs much less to create a system of smaller computers with the equivalent power of a single large computer. This makes it more cost-effective for corporate divisions and departments to obtain separate computers. It's also much more cost-effective to add workstations to a network than to update a mainframe system.



Modular growth: In a distributed environment, it's much easier to handle expansion. New sites can be added to the network without affecting the operations of other sites. This flexibility allows an organization to expand relatively easily.

Disadvantages

Complexity: A DDBMS that hides the distributed nature from the user and provides an acceptable level of performance, reliability, and availability is inherently more complex than a centralized DBMS. Replication also adds an extra level of complexity, which if not handled adequately, will lead to degradation in availability, reliability, and performance compared with the centralized system, and the advantages we cited above will become disadvantages.

Cost: Increased complexity means that we can expect the procurement and maintenance costs for a DDBMS to be higher than those for a centralized DBMS. Furthermore, a DDBMS requires additional hardware to establish a network between sites. There are ongoing communication costs incurred with the use of this network. There are also additional manpower costs to manage and maintain the local DBMSs and the underlying network.

Security: In a centralized system, access to the data can be easily controlled. However, in a DDBMS not only does access to replicated data have to be controlled in multiple locations, but the network itself has to be made secure. In the past, networks were regarded as an insecure communication medium. Although this is still partially true, significant developments have been made recently to make networks more secure.

Integrity control more difficult: Enforcing integrity constraints generally requires access to a large amount of data that defines the constraint, but is not involved in the actual update operation itself. In a DDBMS, the communication and processing costs that are required to enforce integrity constraints may be prohibitive.

Lack of standards: Although DDBMSs depend on effective communication, we are only now starting to see the appearance of standard communication and data access protocols. This lack of standards has significantly limited the potential of DDBMSs. There are also no tools or methodologies to help users convert a centralized DBMS into a distributed DBMS.



Lack of experience: General-purpose DDBMSs have not been widely accepted, although many of the protocols and problems are well understood. Consequently, we do not yet have the same level of experience in industry as we have with centralized DBMSs. For a prospective adopter of this technology, this may be a significant deterrent.

Database design more complex: Besides the normal difficulties of designing a centralized database, the design of a distributed database has to take account of fragmentation of data, allocation of fragments to specific sites, and data replication.

19.6 Describe the expected functionality of a replication server.

At its basic level, we expect a distributed data replication service to be capable of copying data from one database to another, synchronously or asynchronously. However, there are many other functions that need to be provided, such as:

• Specification of replication schema: The system should provide a mechanism to allow a privileged user to specify the data and objects to be replicated.

• Subscription mechanism: The system should provide a mechanism to allow a privileged user to subscribe to the data and objects available for replication.

• Initialization mechanism: The system should provide a mechanism to allow for the initialization of a target replica.

• Scalability: The service should be able to handle the replication of both small and large volumes of data.

• Mapping and transformation: The service should be able to handle replication across different DBMSs and platforms. This may involve mapping and transforming the data from one data model into a different data model, or the data in one data type to a corresponding data type in another DBMS.

• Object replication: It should be possible to replicate objects other than data. For example, some systems allow indexes and stored procedures (or triggers) to be replicated.

• Easy administration: It should be easy for the DBA to administer the system and to check the status and monitor the performance of the replication system components.



19.7 Compare and contrast the different ownership models for replication. Give examples to illustrate your answer.

Ownership relates to which site has the privilege to update the data. The main types of ownership are master/slave, workflow, and update-anywhere (sometimes referred to as peer-to-peer or symmetric replication).

Master/slave ownership

With master/slave ownership, asynchronously replicated data is owned by one site, the master or primary site, and can be updated by only that site. Using a 'publish-and-subscribe' metaphor, the master site (the publisher) makes data available. Other sites 'subscribe' to the data owned by the master site, which means that they receive read-only copies on their local systems. Potentially, each site can be the master site for non-overlapping data sets. However, there can only ever be one site that can update the master copy of a particular data set, and so update conflicts cannot occur between sites. A master site may own the data in an entire table, in which case other sites subscribe to read-only copies of that table. Alternatively, multiple sites may own distinct fragments of the table, and other sites then subscribe to read-only copies of the fragments. This type of replication is also known as asymmetric replication.

Workflow ownership

Like master/slave ownership, this model avoids update conflicts while at the same time providing a more dynamic ownership model. Workflow ownership allows the right to update replicated data to move from site to site. However, at any one moment, there is only ever one site that may update that particular data set. A typical example of workflow ownership is an order processing system, where the processing of orders follows a series of steps, such as order entry, credit approval, invoicing, shipping, and so on. In a centralized DBMS, applications of this nature access and update the data in one integrated database: each application updates the order data in sequence when, and only when, the state of the order indicates that the previous step has been completed.

Update-anywhere (symmetric replication) ownership

The two previous models share a common property: at any given moment, only one site may update the data; all other sites have read-only access to the replicas. In some environments, this is too restrictive. The update-anywhere model creates a peer-to-peer environment where multiple sites have equal rights to update replicated data. This allows local sites to function autonomously, even when other sites are not available.

Shared ownership can lead to conflict scenarios and the replication architecture has to be able to employ a methodology for conflict detection and resolution. A simple mechanism to detect conflict within a single table is for the source site to send both the old and new values (before- and after-images) for any records that have been updated since the last refresh. At the target site, the replication server can check each record in the target database that has also been updated against these values. However, consideration has to be given to detecting other types of conflict such as violation of referential integrity between two tables. There have been many mechanisms proposed for conflict resolution, but some of the most common are: earliest/latest timestamps, site priority, and holding for manual resolution.

19.8 Give a definition of an OODBMS. What are the advantages and disadvantages of an OODBMS?

OODM: A (logical) data model that captures the semantics of objects supported in object-oriented programming.
OODB: A persistent and sharable collection of objects defined by an OODM.
OODBMS: The manager of an OODB.

19.9 Give a definition of an ORDBMS. What are the advantages and disadvantages of an ORDBMS?

There is no single extended relational model; rather, there are a variety of these models, whose characteristics depend upon the way and the degree to which extensions were made. However, all the models do share the same basic relational tables and query language, all incorporate some concept of 'object', and some have the ability to store methods (or procedures or triggers) as well as data in the database.

19.10 Give a definition of a data warehouse. Discuss the benefits of implementing a data warehouse.

Data warehouse: a consolidated/integrated view of corporate data drawn from disparate operational data sources and a range of end-user access tools capable of supporting simple to highly complex queries to support decision-making.

Benefits:



• Potential high return on investment
• Competitive advantage
• Increased productivity of corporate decision-makers

19.11 Describe the characteristics of the data held in a data warehouse.

The data held in a data warehouse is described as being subject-oriented, integrated, time-variant, and non-volatile (Inmon, 1993).

• Subject-oriented, as the warehouse is organized around the major subjects of the organization (such as customers, products, and sales) rather than the major application areas (such as customer invoicing, stock control, and product sales). This is reflected in the need to store decision-support data rather than application-oriented data.

• Integrated, because of the coming together of source data from different organization-wide applications systems. The source data is often inconsistent, using, for example, different data types and/or formats. The integrated data source must be made consistent to present a unified view of the data to the users.

• Time-variant, because data in the warehouse is only accurate and valid at some point in time or over some time interval. The time-variance of the data warehouse is also shown in the extended time that the data is held, the implicit or explicit association of time with all data, and the fact that the data represents a series of snapshots.

• Non-volatile, as the data is not updated in real-time but is refreshed from operational systems on a regular basis. New data is always added as a supplement to the database, rather than a replacement. The database continually absorbs this new data, incrementally integrating it with the previous data.
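The time-variant and non-volatile properties can be illustrated with an append-only snapshot table. This is a minimal sketch using Python's built-in sqlite3 module; the SalesSnapshot table, its columns, the refresh function, and the sample dates and figures are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE SalesSnapshot (
    loadDate  TEXT,     -- explicit association of time with every row
    productNo TEXT,
    unitsSold INTEGER,
    PRIMARY KEY (loadDate, productNo)
)""")


def refresh(conn, load_date, operational_rows):
    """Append one periodic snapshot; earlier snapshots are never overwritten."""
    conn.executemany(
        "INSERT INTO SalesSnapshot VALUES (?, ?, ?)",
        [(load_date, product, units) for product, units in operational_rows])


# Two periodic refreshes from a (hypothetical) operational system.
refresh(conn, "2004-01-31", [("P1", 10), ("P2", 4)])
refresh(conn, "2004-02-29", [("P1", 17), ("P2", 4)])

history = conn.execute(
    "SELECT loadDate, unitsSold FROM SalesSnapshot "
    "WHERE productNo = 'P1' ORDER BY loadDate").fetchall()
```

Because each refresh only adds rows, the warehouse keeps the January figure alongside the February one, which is what makes trend analysis over time possible.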

19.12 Discuss how data marts differ from data warehouses and identify the main reasons for implementing a data mart.

A data mart holds a subset of the data in a data warehouse, normally in the form of summary data relating to a particular department or business area such as Marketing or Customer Services. The data mart can be stand-alone or linked centrally to the corporate data warehouse. As a data warehouse grows larger, the ability to serve the various needs of the organization may be compromised. The popularity of data marts stems from the fact that corporate data warehouses proved difficult to build and use.

19.13 Discuss what online analytical processing (OLAP) is and how OLAP differs from data warehousing.

Online analytical processing (OLAP): The dynamic synthesis, analysis, and consolidation of large volumes of multi-dimensional data. The key characteristics of OLAP applications include multi-dimensional views of data, support for complex calculations, and time intelligence.

19.14 Describe OLAP applications and identify the characteristics of such applications.

An essential requirement of all OLAP applications is the ability to provide users with just-in-time (JIT) information, which is necessary to make effective decisions about an organization's strategic directions.

19.15 Discuss how data mining can realize the value of a data warehouse.

Simply storing information in a data warehouse does not provide the benefits an organization is seeking. To realize the value of a data warehouse, it's necessary to extract the knowledge hidden within the warehouse. However, as the amount and complexity of the data in a data warehouse grows, it becomes increasingly difficult, if not impossible, for business analysts to identify trends and relationships in the data using simple query and reporting tools. Data mining is one of the best ways to extract meaningful trends and patterns from huge amounts of data. Data mining discovers information within data warehouses that queries and reports cannot effectively reveal.



19.16 Why would we want to dynamically generate web pages from data held in the operational database? List some general requirements for web-database integration.

An HTML/XML document stored in a file is an example of a static Web page: the content of the document does not change unless the file itself is changed. On the other hand, the content of a dynamic Web page is generated each time it's accessed. As a result, a dynamic Web page can have features that are not found in static pages, such as:

• It can respond to user input from the browser. For example, returning data requested by the completion of a form or the results of a database query.
• It can be customized by and for each user. For example, once a user has specified some preferences when accessing a particular site or page (such as area of interest or level of expertise), this information can be retained and information returned appropriate to these preferences.
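A dynamic page of the first kind can be sketched as a function that builds the HTML from a database query each time it is called. The example uses Python's built-in sqlite3 and html modules; the Video table, its sample rows, and the render_page function are assumptions for illustration, and a real deployment would sit behind a Web server rather than being called directly.

```python
import sqlite3
from html import escape

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Video (title TEXT, category TEXT);
INSERT INTO Video VALUES
    ('Alpha', 'Action'), ('Beta <2>', 'Drama'), ('Gamma', 'Action');
""")


def render_page(conn, category):
    """Build the page from the current database contents on every request."""
    rows = conn.execute(
        "SELECT title FROM Video WHERE category = ? ORDER BY title",
        (category,)).fetchall()  # user input is bound as a parameter
    items = "".join(f"<li>{escape(title)}</li>" for (title,) in rows)
    return (f"<html><body><h1>{escape(category)}</h1>"
            f"<ul>{items}</ul></body></html>")


page = render_page(conn, "Action")
```

If a new Action title is inserted into the table, the next call to render_page includes it without any file being edited, which is the difference from a static page. Binding the category as a parameter and escaping the output are the usual precautions when user input reaches a query or a page.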

Not in any ranked order, the requirements are as follows:

• The ability to access valuable corporate data in a secure manner.
• Data and vendor independent connectivity to allow freedom of choice in the selection of the DBMS now and in the future.
• The ability to interface to the database independent of any proprietary Web browser or Web server.
• A connectivity solution that takes advantage of all the features of an organization's DBMS.
• An open-architecture approach to allow interoperability with a variety of systems and technologies.
• A cost-effective solution that allows for scalability, growth, and changes in strategic directions, and helps reduce the costs of developing and maintaining applications.
• Support for transactions that span multiple HTTP requests.
• Support for session- and application-based authentication.
• Acceptable performance.
• Minimal administration overhead.
• A set of high-level productivity tools to allow applications to be developed, maintained, and deployed with relative ease and speed.



19.17 What is XML and discuss the approaches for managing XML-based data.

XML: a meta-language (a language for describing other languages) that enables designers to create their own customized tags to provide functionality not available with HTML.

It's anticipated that there will be two main models that will exist: data-centric and document-centric.

In a data-centric model, XML is used as the storage and interchange format for data that is structured, appears in a regular order, and is most likely to be machine processed instead of read by a human. In a data-centric model, the fact that the data is stored and transferred as XML is incidental and other formats could also have been used. In this case, the data could be stored in a relational, object-relational, or object-oriented DBMS. For example, Oracle has completely integrated XML into its Oracle 9i system. XML can be stored as entire documents using the data types XMLType or CLOB/BLOB (Character/Binary Large Object) or can be decomposed into its constituent elements and stored that way. The Oracle query language has also been extended to permit searching of XML-based content.

In a document-centric model, the documents are designed for human consumption (for example, books, newspapers, and email). Due to the nature of this information, much of the data will be irregular or incomplete, and its structure may change rapidly or unpredictably. Unfortunately, relational, object-relational, and object-oriented DBMSs do not handle data of this nature particularly well. Content management systems are an important tool for handling these types of documents. Underlying such a system, you may now find a native XML database:
