
DATABASE MANAGEMENT SYSTEM
File System
◦ The file system was an early attempt to computerize the manual filing system.

◦ A file system is a method for storing and organizing computer files and their data so that they are easy to find and access.

◦ File Systems may use a storage device such as a hard disk or CD‐ROM.
Characteristics
• It is a group of files storing the data of an organization.

• Each file is independent of the others.

• Each file contains the information for one specific function, such as accounting or inventory, and is processed separately.

• Files are designed using application programs written in programming languages such as COBOL, C, or C++.
A File based system
Drawbacks of File based System
• Data redundancy and inconsistency
• Difficulty in accessing data
• Data isolation
• Integrity problems
• Concurrent access anomalies

• Data security
• Program-data dependence: changes in a program require changes to the data accessed by that program
Data Redundancy
• Often the same information is duplicated in two or more files, which may lead to inconsistency.

• Assume the same data is repeated in two or more files. If a change is made to the data in one file, the same change must be made to the data in the other files as well.

• If this is not done, the same data field ends up with multiple different values.
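The effect described above can be sketched in a few lines of Python. This is only an illustration: the two dictionaries stand in for two independent files, and the customer id, names, and addresses are made-up values.

```python
# Two hypothetical "files" kept by separate applications; both store the
# same customer's address, so the data is redundant.
accounts_file = {"C001": {"name": "Asha", "address": "12 Lake Road"}}
loans_file = {"C001": {"name": "Asha", "address": "12 Lake Road"}}


def update_address(file_records, customer_id, new_address):
    """Update the address in ONE file only, as a single application would."""
    file_records[customer_id]["address"] = new_address


def find_inconsistencies(file_a, file_b, field):
    """Return the customer ids whose `field` differs between the two files."""
    return [cid for cid in file_a
            if cid in file_b and file_a[cid][field] != file_b[cid][field]]


# The accounts application changes the address; the loans file is not touched.
update_address(accounts_file, "C001", "7 Hill Street")
conflicts = find_inconsistencies(accounts_file, loans_file, "address")
print(conflicts)
```

After the update, the same customer has two different addresses, which is the inconsistency the slide describes.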
Difficulty in Accessing Data
◦ Assume that in a banking system there is a need to find the names of all customers who live within a particular postal-code area.

◦ But the only program available generates the list of all customers, so either the list must be filtered by hand or a new application program must be written.


Data Isolation
◦ Data isolation means that related data is not all available in one file.

◦ Generally, the data is scattered across various files, and the files may be in different formats, so writing new application programs to retrieve the appropriate data is difficult.
Integrity Problem
• Integrity constraints are the rules that ensure the completeness and reliability of data.
• Integrity constraints (e.g. account balance > 0) become part of the program code.
• It is hard to add new constraints or change existing ones.
Concurrent Access Anomalies
• Access to the same data in the same file by several users at the same time is called concurrent access.
• In a file system, uncontrolled concurrent access can lead to incorrect data. For example, two students may try to borrow the same library book at the same time.

• If multiple users update the same data simultaneously, the data can be left in an inconsistent state.

• Example: Bank account A contains Rs. 6000/-. If two transactions withdraw funds (Rs. 500/- and Rs. 1000/- respectively) from the account at the same time, the concurrent executions may leave the account in an incorrect state.

• In a file system, the program acting on behalf of each withdrawal reads the old balance, reduces it by the withdrawal amount, and writes the result back.
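The bank example above can be traced with a small Python simulation. It is a sketch, not real concurrent code: the interleaving is fixed by hand so that both withdrawals read the balance before either writes it back, and the function names are illustrative.

```python
def run_interleaved(balance, amounts):
    """Simulate withdrawals that all read the balance before any writes back."""
    # Step 1: every withdrawal program reads the same old balance.
    reads = [balance for _ in amounts]
    # Step 2: each program subtracts its amount from the value IT read and
    # writes the result back, overwriting what the previous program wrote.
    for old_balance, amount in zip(reads, amounts):
        balance = old_balance - amount
    return balance


def run_serial(balance, amounts):
    """Run the withdrawals one after another (the correct behaviour)."""
    for amount in amounts:
        balance = balance - amount
    return balance


interleaved = run_interleaved(6000, [500, 1000])
serial = run_serial(6000, [500, 1000])
print(interleaved, serial)
```

With this interleaving the account ends at 5000 instead of 4500, because the first withdrawal's write is overwritten by the second. This is often called a lost update.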
Data Security
• Data maintained in flat files is easily accessible and therefore not secure.

• Example: the Customer_Transaction file has details about the total available balance of all customers. A customer wants information about his/her account balance.

• In a file system it is difficult to give the customer access to only his/her data in the file.

• Thus enforcing security constraints for the entire file or for certain data items is difficult.
Database Approach
• The limitations of the file-based system are overcome by a database system.

• A database is a computer-based record-keeping system whose overall purpose is to record and maintain data.

• The database is a single, large repository of data, which can be used simultaneously by many departments and users.
DBMS
• A DBMS is a collection of interrelated data and a set of programs to access the data.
• A database management system (DBMS) is a software package designed to define, manipulate, retrieve and manage data in a database.
• A DBMS generally manipulates the data itself, the data format, field names, record structure and file structure.
Database
◦ Serves many applications by centralizing data and
controlling redundant data
Database management system (DBMS)
◦ Software that permits an organization to centralize data, manage it efficiently, and provide access to the stored data by application programs
◦ Interfaces between applications and physical data files
◦ Separates logical and physical views of data
Solves problems of traditional file environment
◦ Controls redundancy
◦ Eliminates inconsistency
◦ Uncouples programs and data
◦ Enables organization to centrally manage data and data security
Applications of DBMS
◦ Banking: all transactions
◦ Airlines: reservations, schedules
◦ Universities: registration, grades
◦ Sales: customers, products, purchases
◦ Manufacturing: production, inventory, orders,
supply chain
◦ Human resources: employee records, salaries,
tax deductions
◦ Databases touch all aspects of our lives
Advantages of DBMS
◦ Manage large bodies of information
◦ Define structures for storage
◦ Provide mechanisms for manipulation
◦ Ensure the safety of the stored information
◦ Enable data to be shared
◦ Protect data from system crashes or
unauthorized access.
Examples of DBMS Packages
◦ Oracle
◦ Microsoft SQL Server
◦ Microsoft Access
◦ MySQL
Data Models
◦ A database model is a type of data model that determines the logical structure of a database and fundamentally determines how data can be stored, organized and manipulated. The most popular example of a database model is the relational model, which uses a table-based format.

◦ A database model defines the logical design of data. The model describes the relationships between different parts of the data. Historically, three models have been commonly used in database design:
1. Hierarchical Model
2. Network Model
3. Relational Model
Hierarchical Model

◦ In this model each entity has only one parent but can have several children. At the top of the hierarchy there is only one entity, which is called the root.
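A minimal Python sketch of such a hierarchy follows. The `Entity` class and the college example are hypothetical and only illustrate the one-parent, many-children rule.

```python
class Entity:
    """A node in a hierarchy: exactly one parent (None for the root)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path_from_root(self):
        """Follow the single parent link upward; return names from the root down."""
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return list(reversed(path))


root = Entity("College")                 # the only entity without a parent
dept = Entity("Department", parent=root)
course = Entity("Course", parent=dept)
student = Entity("Student", parent=dept)

print(course.path_from_root())
```

Because every entity has a single parent, there is exactly one path from the root to any entity.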
Network Model

In the network model, entities can be accessed through several paths.


Relational Model

◦ In this model, data is organized in two-dimensional tables called relations. The tables, or relations, are related to each other.
Relational DBMS
◦ Represents data as two-dimensional tables called relations or files
◦ Each table contains data on an entity and its attributes
◦ Table: grid of columns and rows
◦ Rows (tuples): Records for different entities
◦ Fields (columns): Represent an attribute of the entity
◦ Key field: Field used to uniquely identify each record
◦ Primary key: Field in a table used as its key field
◦ Foreign key: Primary key used in a second table as a look-up field to identify records from the original table
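The key concepts above can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the `supplier` and `part` tables and their rows are invented for illustration, and SQLite is only one possible relational DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# supplier_id is the primary key of supplier ...
conn.execute(
    "CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# ... and reappears in part as a foreign key (the look-up field).
conn.execute("""
    CREATE TABLE part (
        part_id     INTEGER PRIMARY KEY,
        part_name   TEXT NOT NULL,
        supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id)
    )""")
conn.execute("INSERT INTO supplier VALUES (1, 'Acme Metals')")
conn.execute("INSERT INTO part VALUES (100, 'Door latch', 1)")


def try_insert(sql):
    """Run an INSERT and report whether the DBMS accepted or rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# A second supplier with the same primary key value is not allowed.
duplicate_key = try_insert("INSERT INTO supplier VALUES (1, 'Other Co')")
# A part that points at a supplier that does not exist is not allowed either.
missing_supplier = try_insert("INSERT INTO part VALUES (101, 'Hinge', 99)")
print(duplicate_key, missing_supplier)
```

Both invalid inserts are rejected by the DBMS itself, without any checking code in the application.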
Operations of a Relational DBMS
◦ Three basic operations are used to develop useful sets of data
◦ SELECT: Creates a subset of all records that meet stated criteria
◦ JOIN: Combines relational tables to provide the user with more information than is available in individual tables
◦ PROJECT: Creates a subset of the columns in a table, creating tables with only the information specified
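The three operations can be sketched with Python's built-in `sqlite3` module. The tables and rows below are hypothetical. Note that in SQL all three are written with the `SELECT` statement: the relational SELECT operation corresponds to the `WHERE` clause, PROJECT to the column list, and JOIN to the `JOIN` clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE part (part_id INTEGER PRIMARY KEY, part_name TEXT,
                   unit_price REAL, supplier_id INTEGER);
INSERT INTO supplier VALUES (1, 'Acme Metals', 'Pune'),
                            (2, 'Bright Plastics', 'Delhi');
INSERT INTO part VALUES (100, 'Door latch', 22.0, 1),
                        (101, 'Hinge', 9.5, 1),
                        (102, 'Handle', 14.0, 2);
""")

# SELECT operation: keep only the rows that meet a stated criterion.
selected = conn.execute(
    "SELECT * FROM part WHERE unit_price > 10 ORDER BY part_id").fetchall()

# PROJECT operation: keep only some of the columns.
projected = conn.execute(
    "SELECT part_name, unit_price FROM part ORDER BY part_id").fetchall()

# JOIN operation: combine rows of two tables through the shared supplier_id.
joined = conn.execute("""
    SELECT part.part_name, supplier.name
    FROM part JOIN supplier ON part.supplier_id = supplier.supplier_id
    ORDER BY part.part_id""").fetchall()

print(selected)
print(projected)
print(joined)
```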
The Object-Oriented Model
◦ Uses an object-oriented approach to maintain records.
◦ Data is stored in the form of objects. An object consists of both data and the procedures that manipulate the data. This is called encapsulation.
◦ Through encapsulation, an object can be planted in different datasets by replicating all or some of the attributes of the parent object. This is called inheritance.
◦ An ODBMS can handle objects such as drawings, maps and web pages.
◦ It can handle data such as graphics, voice and text more easily than the relational model.
◦ An ODBMS is useful for CAD, GIS and applications used to update thousands of web pages daily.
◦ An ODBMS provides a GUI for interacting with the database. Users choose objects from classes (groups of objects that share similar characteristics).
◦ A drawback of ODBMSs is that they store data with the application (dependence between applications and data).
◦ Structured Query Language (SQL) is a standard computer language for relational database management and data manipulation. SQL is used to query, insert, update and delete data.
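A short sketch of these SQL statements, run through Python's built-in `sqlite3` module. The `student` table and its rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, grade TEXT)")

# INSERT: add new rows. The ? placeholders keep values out of the SQL text.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Meera", "B"), (2, "Ravi", "C"), (3, "Sana", "A")])

# UPDATE: change a value in an existing row.
conn.execute("UPDATE student SET grade = ? WHERE roll_no = ?", ("A", 1))

# DELETE: remove a row.
conn.execute("DELETE FROM student WHERE roll_no = ?", (2,))

# Query: read the remaining rows back.
rows = conn.execute(
    "SELECT roll_no, name, grade FROM student ORDER BY roll_no").fetchall()
print(rows)
```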
Schema, metadata
◦ A schema describes the structure of a database (names and sizes of fields, relationships among different sets of records in files).
◦ The data dictionary is a repository of each table's structure and the types of its fields.
◦ Metadata (data about the data) includes the source of the data, the tables that are related to the data, field and index information, and population rules: what is inserted or updated and how often.
◦ Metadata helps the DBA to maintain the database and understand the meaning of the fields and their relationships.
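As a small illustration, most DBMSs let you read this structural information back. The sketch below uses Python's built-in `sqlite3` module and SQLite's `PRAGMA table_info` command; the `employee` table is a made-up example, and other DBMSs expose their data dictionary in different ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   VARCHAR(40) NOT NULL,
        salary REAL
    )""")

# Each row returned by table_info describes one column:
# (position, name, declared type, not-null flag, default value, primary-key flag)
columns = [
    (name, col_type, bool(pk))
    for _cid, name, col_type, _notnull, _default, pk
    in conn.execute("PRAGMA table_info(employee)")
]
print(columns)
```

The output lists each field's name, declared type and whether it is part of the primary key, which is the kind of information a data dictionary holds.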
Data Modeling
◦ Analyzing an organization's data and identifying the relationships among the data is called data modeling.
◦ Data modeling should be proactive and can be done before data collection. It helps decision makers get a clear picture of what data needs to be collected and how the data should be organized to generate reports for decision making.
Entity-Relationship Diagram
◦ Effective data modeling and design of each database involves creating a conceptual blueprint of the database. This blueprint is called an entity-relationship diagram (ERD).
◦ Used by database designers to document the
data model
◦ Illustrates relationships between entities
Types of relationships in a relational database under the join operation:
◦ One-to-one: An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A.

◦ One-to-many: An entity in A is associated with any number of entities in B. An entity in B is associated with at most one entity in A.

◦ Many-to-one: An entity in A is associated with at most one entity in B. An entity in B is associated with any number of entities in A.

◦ Many-to-many: An entity in A is associated with any number of entities in B, and an entity in B is associated with any number of entities in A.
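The four definitions above can be turned into a small Python check. This is a sketch: `classify_relationship` is a hypothetical helper, and it only reports the relationship type observed in a given set of (A, B) pairs, which may be narrower than what the design allows.

```python
from collections import Counter


def classify_relationship(pairs):
    """Classify the association between A and B from observed (a, b) pairs."""
    pairs = set(pairs)                           # ignore repeated pairs
    b_per_a = Counter(a for a, _b in pairs)      # how many B each A is linked to
    a_per_b = Counter(b for _a, b in pairs)      # how many A each B is linked to
    a_has_many = any(n > 1 for n in b_per_a.values())
    b_has_many = any(n > 1 for n in a_per_b.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


# Hypothetical examples: A is on the left of each pair, B on the right.
dept_employee = classify_relationship([("Sales", "Ann"), ("Sales", "Bob")])
employee_dept = classify_relationship([("Ann", "Sales"), ("Bob", "Sales")])
student_course = classify_relationship(
    [("Ann", "DBMS"), ("Ann", "Networks"), ("Bob", "DBMS")])
person_passport = classify_relationship([("Ann", "P1"), ("Bob", "P2")])
print(dept_employee, employee_dept, student_course, person_passport)
```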
Databases on the Web
◦ Organizations have linked their databases to the Internet.
◦ The interface of an online database is designed with web programming technologies such as ASP, ASP.NET, PHP (Hypertext Preprocessor) and Java servlets.
DATA WAREHOUSE
◦ Data warehouses are huge collections of historical transactions copied from transactional databases, often along with other data from outside sources.
◦ Managers use software tools (such as OLAP and business intelligence systems) to glean useful information from the data warehouse to support their decision making.
◦ Some data warehouses are made up of several data marts, each focusing on an organizational unit or a subject.
◦ The three phases of building a data warehouse are extraction, transformation and loading (ETL).
◦ In the extraction phase, the builders create files from the transactional databases and save them on the server that holds the data warehouse.
◦ In the transformation phase, specialists cleanse the data and modify it into a form that allows insertion into the data warehouse, for example by checking and fixing spelling.
◦ Data cleansing: software detects and corrects data that are incorrect, incomplete, improperly formatted, or redundant, and enforces consistency among different sets of data from separate information systems.
◦ In the loading phase, the specialists transfer the transformed files to the data warehouse. They then compare the data in the data warehouse with the original data to confirm completeness.
◦ Metadata helps them to do so.
◦ Techniques such as data mining, OLAP and BI are used to change the structure and contents of the data warehouse as required.
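The extraction, transformation and loading phases described above can be sketched in plain Python. Everything here is a simplification under assumptions: the source rows, field names and cleansing rules are invented, and a real warehouse would use dedicated ETL tools rather than in-memory lists.

```python
# Hypothetical rows from a transactional system.
source_rows = [
    {"txn_id": 1, "customer": "  asha  rao ", "amount": 250.0},
    {"txn_id": 2, "customer": "RAVI KUMAR", "amount": None},   # incomplete
    {"txn_id": 1, "customer": "Asha Rao", "amount": 250.0},    # redundant copy
    {"txn_id": 3, "customer": "sana  ali", "amount": 90.5},
]


def extract(rows):
    """Extraction: copy the rows out of the transactional source."""
    return [dict(row) for row in rows]


def transform(rows):
    """Transformation: drop incomplete and redundant rows, tidy up names."""
    cleaned, seen_ids = [], set()
    for row in rows:
        if row["amount"] is None or row["txn_id"] in seen_ids:
            continue
        seen_ids.add(row["txn_id"])
        # Collapse extra spaces and use consistent capitalization.
        name = " ".join(row["customer"].split()).title()
        cleaned.append(
            {"txn_id": row["txn_id"], "customer": name, "amount": row["amount"]})
    return cleaned


def load(rows, warehouse):
    """Loading: append the transformed rows to the warehouse."""
    warehouse.extend(rows)
    return len(rows)


warehouse = []
loaded_count = load(transform(extract(source_rows)), warehouse)

# Completeness check: every complete source transaction reached the warehouse.
expected_ids = {r["txn_id"] for r in source_rows if r["amount"] is not None}
is_complete = expected_ids == {r["txn_id"] for r in warehouse}
print(loaded_count, is_complete)
```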
Data marts

◦ Subset of a data warehouse
◦ Summarized or highly focused portion of a firm's data for use by a specific population of users
◦ Typically focuses on a single subject or line of business
◦ Example: bookseller Barnes & Noble used to maintain a series of data marts: one for point-of-sale data in retail stores, another for college bookstore sales, and a third for online sales
Online analytical processing (OLAP)
◦ Supports multidimensional data analysis
◦ Viewing data using multiple dimensions
◦ Each aspect of information (product, pricing, cost, region, time period) is a different dimension
◦ E.g., how many washers were sold in the East in June compared with other regions?
◦ OLAP enables rapid, online answers to ad hoc queries
◦ Examples: IBM Cognos, SAP NetWeaver BW, Microsoft Analysis Services, MicroStrategy Intelligence Server, Essbase, icCube, Infor BI OLAP Server and Jedox OLAP Server.
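The washer question above can be answered by fixing two dimensions (product and month) and totalling along a third (region). The Python sketch below does this over a handful of made-up sales records. It illustrates the idea only and is not how the OLAP servers listed above work internally.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, month, units_sold).
sales = [
    ("washer", "East", "June", 120),
    ("washer", "West", "June", 95),
    ("washer", "East", "May", 80),
    ("dryer", "East", "June", 40),
    ("washer", "North", "June", 60),
    ("washer", "East", "June", 30),
]


def units_by_region(facts, product, month):
    """Fix the product and month dimensions; total units along region."""
    totals = defaultdict(int)
    for fact_product, region, fact_month, units in facts:
        if fact_product == product and fact_month == month:
            totals[region] += units
    return dict(totals)


june_washers = units_by_region(sales, "washer", "June")
print(june_washers)
```

The same function answers other ad hoc questions by changing the product or month, for example `units_by_region(sales, "washer", "May")`.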
Data Mining
◦ Data mining is the process of uncovering patterns inside large
sets of structured data to predict future outcomes. Structured
data is data that is organized into columns and rows so that it
can be accessed and modified efficiently.
◦ Data mining is a process used by companies to turn raw data
into useful information. By using software to look for patterns in
large batches of data, businesses can learn more about their
customers and develop more effective marketing strategies
as well as increase sales and decrease costs. Data mining
depends on effective data collection and warehousing as
well as computer processing.
◦ Example: XLMiner in Excel can be used as a data mining tool for sentiment analysis of online reviews.
