
COURSE 1

Introduction to Databases
Database Management Systems
What is a Database?
SELECT M.name FROM Musicians M
INNER JOIN Plays P ON M.mno = P.mno
INNER JOIN Instruments I ON P.instr = I.instr
WHERE I.class = 'wind'
GROUP BY M.mno, M.name
HAVING COUNT(DISTINCT P.instr) = ALL
(SELECT COUNT(*) FROM Instruments WHERE class = 'wind')
3
What is a database?
Database = a very large, integrated
collection of related data items, stored for
record-keeping and analysis, that exists over a
long period of time
and
models real-world activity through a
Data Model
4
Data Models
A data model is a collection of concepts for
describing data:
structures
relationships
semantics
consistency constraints.
A schema is a description of a particular
collection of data, using the given data
model.
5
Brief History of Data Models
Early Data Models (late 1960s) evolved from
file-based processing systems
Visualize the data much as it was stored:

Hierarchical Model
Tree-based view
Network Model
Graph-based view
6
Brief History of Data Models (cont)
In the early 1970s, Ted Codd
introduced a new data model,
the Relational Model,
and the concept of
data abstraction

Soon thereafter, a team at IBM
developed SQL (Structured Query Language)
7
Brief History of Data Models (cont)
Object-Oriented Data Model (1967)


8
Schema vs. Data
Recall: the database schema describes the structure
of the database
Changed infrequently
a.k.a. database intension
a.k.a. metadata (= data about data)

Database state refers to the data in the database
at any given moment (snapshot)
Changes frequently
a.k.a. database extension
The DBMS ensures that every database state is a valid state

Database instance is another name for the database
state; the term instance is also applied to a specific
data item in the database (e.g., a record instance)
9
Relation Example
Student information may be stored in a relation
with the following schema:

Students(sid: integer, name: string, email: string, age: integer, gr: integer)

Relation schema: each field has a name (e.g. sid) and a type (e.g. integer).
Relation records: the rows below the header.

sid   name   email                   age  gr
2833  Jones  jones@scs.ubbcluj.ro    19   931
2877  Smith  smith@scs.ubbcluj.ro    20   932
2976  Jones  jones@math.ubbcluj.ro   21   933
2765  Mary   mary@math.ubbcluj.ro    22   933
10
What is a DBMS?
A Database Management System (DBMS) is a
collection of programs that:
allows users to create a new database and specify its schema
gives users the ability to query and modify the data efficiently
keeps the data secure from accidents or unauthorized use
controls the access to the data for many users at once

The database and DBMS software together make
up what is known as the Database System.
11
Database Management Systems
Record-based data models:
Relational model (e.g. DB2, Informix, Oracle, MS
SQL Server, Microsoft Access, FoxBase, Paradox)
Hierarchical model (e.g. IBM's IMS DBMS)
Network model (used in IDMS)

Object-based data models:
Object-oriented model (e.g. ObjectStore and Versant)
Object-relational model (e.g. Illustra, O2, UniSQL).
12
Types of DBMSs
General-purpose DBMS
Multimedia DBMS
Geographic Information Systems (GIS)
Data warehouse DBMS
Real-time DBMS
Active DBMS
Data Stream Management Systems
13
Some Recent Trends
DBMSs are getting smaller and smaller
A DBMS that can store GBs of data can run on a PC

Databases are getting bigger and bigger
Multiple TBs (1 terabyte = 10^12 bytes) are not uncommon
Databases also able to store images, video, audio
Database stored on secondary storage devices

DBMS Supporting Parallel Computing
Speed-up query processing through parallelism (e.g., read data
from many disks)
However, need special algorithms to partition data correctly
14
When to Use Databases?
When we have to deal with:
Persistence
Large Amounts of Data
Structured Data
Concurrent and Distributed Access
Integrity
Security
15
When NOT to Use Databases?
Initial investment too high
Too much overhead
Application is simple, well-defined, not expected
to change
Stringent real-time requirements (use specialized
real-time DBMS)
Multi-user access to data is not required

Alternative: collection of files managed by access
programs
16
Using Flat Files for Storing Data
Updates and deletions are expensive
Search is expensive (always have to read
the entire file)
Brute force query processing
No buffer management - application must
stage large datasets between main memory
and secondary storage (e.g., buffering,
page-oriented access, 32-bit addressing, etc.)
Special code for different queries
17
Using Flat Files for Storing Data (cont)
No concurrency control - application must protect data
from inconsistency due to multiple
concurrent users
No reliability (can lose data or leave
operations half done)
No API or GUI
No security and access control
No crash recovery
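
The "operations half done" problem listed above is what DBMS transactions address. The following is a minimal sketch using Python's built-in sqlite3 module. The accounts table, its two rows, and the transfer function are hypothetical and serve only as an illustration; they are not part of the course material.

```python
import sqlite3

# Hypothetical example data: two accounts, balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the first update was rolled back too.
        return False


ok = transfer(conn, "a", "b", 30)
failed = transfer(conn, "a", "b", 500)  # would make a's balance negative
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

With flat files, the application itself would have to detect the failed second step and undo the first one.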
18
Flat Files vs. Database Organization
a. A flat file is considered a one-dimensional
storage system
b. A database is considered a multidimensional storage system
19
HOW TO
manage large amounts of persistent, homogeneous
and structured data that are shared among
distributed users and processes, and whose integrity
must be maintained and whose security must be
controlled?

Answer: by developing Database Applications,
collections of data and programs that allow the
manipulation of these data.
A DB Application is best implemented using a DBMS.
20
Layered Approach to Database Implementation
Different levels
of abstraction
21
Levels of Abstraction

Views describe how users
see the data.

Conceptual schema
defines logical structure

Physical schema describes
the files and indexes used
[Figure: View 1, View 2 and View 3 on top of the Conceptual Schema, which sits on the Physical Schema, which sits on the Disk]
There are many external schemas (views), but a single conceptual
(logical) schema and a single physical (internal) schema.
22
Example: University Database
Conceptual schema:
Students(sid:string, name:string, email:string, age:integer, gr:integer)
Courses(cid:string, cname:string, credits:integer)
Enrolled(sid:string, cid:string, grade:integer)
Teachers(tid:integer, name:string, sal:integer)
Teaches(tid:integer, cid:string)
Physical schema:
Relations stored as unordered files.
Index on first column of Students.
External Schema (View):
Course_info(cid:string, enrollment:integer)
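
The three levels above can be sketched in SQL, here run through Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the TEXT/INTEGER column types are SQLite's nearest equivalents of string/integer, the index name idx_students_sid, the definition chosen for the Course_info view, the course DB101 and all inserted rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual schema: the relations of the University database
CREATE TABLE Students (sid TEXT, name TEXT, email TEXT, age INTEGER, gr INTEGER);
CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade INTEGER);
CREATE TABLE Teachers (tid INTEGER, name TEXT, sal INTEGER);
CREATE TABLE Teaches  (tid INTEGER, cid TEXT);

-- Physical schema: an index on the first column of Students
CREATE INDEX idx_students_sid ON Students(sid);

-- External schema: a view exposing only per-course enrollment counts
CREATE VIEW Course_info AS
    SELECT C.cid AS cid, COUNT(E.sid) AS enrollment
    FROM Courses C LEFT JOIN Enrolled E ON E.cid = C.cid
    GROUP BY C.cid;
""")

# Hypothetical sample rows
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [("Alg100", "Algebra", 6), ("DB101", "Databases", 6)],
)
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [("2833", "Alg100", 9), ("2877", "Alg100", 8)],
)

course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()
print(course_info)
```

A user of Course_info sees one row per course and never needs to know that the counts are computed from Courses and Enrolled.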
23
Data Independence

One of the most important benefits of using
a DBMS!
Applications isolated from how data is
structured and stored.
Logical data independence: Protection from
changes in logical structure of data.
Physical data independence: Protection from
changes in physical structure of data.
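
Both kinds of independence can be illustrated with a small experiment, sketched here with Python's built-in sqlite3 module. The view name Student_names, the index name, and the decision to split Students into Student_core and Student_contact are assumptions made for this example; the rows are the sample Students records used earlier in the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid INTEGER, name TEXT, email TEXT, age INTEGER, gr INTEGER);
INSERT INTO Students VALUES (2833, 'Jones', 'jones@scs.ubbcluj.ro', 19, 931);
INSERT INTO Students VALUES (2877, 'Smith', 'smith@scs.ubbcluj.ro', 20, 932);
INSERT INTO Students VALUES (2976, 'Jones', 'jones@math.ubbcluj.ro', 21, 933);
INSERT INTO Students VALUES (2765, 'Mary', 'mary@math.ubbcluj.ro', 22, 933);
CREATE VIEW Student_names AS SELECT sid, name, age FROM Students;
""")

# The application only ever issues this query against the view.
APP_QUERY = "SELECT name FROM Student_names WHERE age > 20 ORDER BY name"


def run_app_query():
    return [row[0] for row in conn.execute(APP_QUERY)]


before = run_app_query()

# Physical change: add an index. The query text and its result are unaffected.
conn.execute("CREATE INDEX idx_students_age ON Students(age)")
after_index = run_app_query()

# Logical change: split Students into two tables and redefine the view.
conn.executescript("""
CREATE TABLE Student_core    AS SELECT sid, name, age FROM Students;
CREATE TABLE Student_contact AS SELECT sid, email, gr FROM Students;
DROP VIEW Student_names;
DROP TABLE Students;
CREATE VIEW Student_names AS SELECT sid, name, age FROM Student_core;
""")
after_split = run_app_query()

print(before, after_index, after_split)
```

The application query is identical in all three runs; only the definitions below the view changed.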

24
Queries in a DBMS
For the sample University Database, here are
some questions that users may ask:
What is the name of the student with sid 2833?
What is the salary of the professor who teaches the
course with cid Alg100?
How many students are enrolled in course Alg100?
Such questions involving data stored in a DBMS
are called queries.
A DBMS provides a specialized language, called
the query language, in which queries can be posed.
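
The three questions above could be posed in SQL roughly as follows. This is a sketch using Python's built-in sqlite3 module; the enrollment rows, the teacher named Pop and the salary value are hypothetical sample data, not taken from the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid TEXT, name TEXT, email TEXT, age INTEGER, gr INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade INTEGER);
CREATE TABLE Teachers (tid INTEGER, name TEXT, sal INTEGER);
CREATE TABLE Teaches  (tid INTEGER, cid TEXT);

INSERT INTO Students VALUES ('2833', 'Jones', 'jones@scs.ubbcluj.ro', 19, 931);
INSERT INTO Students VALUES ('2877', 'Smith', 'smith@scs.ubbcluj.ro', 20, 932);
INSERT INTO Enrolled VALUES ('2833', 'Alg100', 9);
INSERT INTO Enrolled VALUES ('2877', 'Alg100', 8);
INSERT INTO Teachers VALUES (1, 'Pop', 5000);
INSERT INTO Teaches  VALUES (1, 'Alg100');
""")

# What is the name of the student with sid 2833?
name = conn.execute(
    "SELECT name FROM Students WHERE sid = ?", ("2833",)
).fetchone()[0]

# What is the salary of the professor who teaches the course with cid Alg100?
salary = conn.execute(
    "SELECT T.sal FROM Teachers T "
    "JOIN Teaches X ON X.tid = T.tid WHERE X.cid = ?",
    ("Alg100",),
).fetchone()[0]

# How many students are enrolled in course Alg100?
enrolled = conn.execute(
    "SELECT COUNT(*) FROM Enrolled WHERE cid = ?", ("Alg100",)
).fetchone()[0]

print(name, salary, enrolled)
```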
25
DBMS Languages
Data Definition Language (DDL)
Used to define the conceptual and internal schemas
Includes constraint definition language (CDL) for describing
conditions that database instances must satisfy
Includes storage definition language (SDL) to influence layout of
physical schema (some DBMSs)
Data Manipulation Language (DML)
Used to describe operations on the instances of a database
Procedural DML (how) vs. declarative DML (what)

Note, SQL includes a DML and a DDL in one!
Host Language
General-purpose programming language which lets users embed
DML commands (data sublanguage) into their code
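
The roles of DDL, DML and host language can be seen side by side in a short sketch. Python stands in for the host language here (the laboratory uses C# with ADO.NET, so this is a substitution for illustration), and the course name, credit values and the CHECK constraint are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define a relation, including constraints its instances must satisfy.
conn.execute("""
    CREATE TABLE Courses (
        cid     TEXT PRIMARY KEY,
        cname   TEXT NOT NULL,
        credits INTEGER CHECK (credits > 0)
    )
""")

# DML: operate on the instance. The statements are declarative: they state
# what should change, not how the rows are located.
conn.execute("INSERT INTO Courses VALUES (?, ?, ?)", ("Alg100", "Algebra", 6))
conn.execute("UPDATE Courses SET credits = credits + 1 WHERE cid = ?", ("Alg100",))

# The DBMS enforces the constraint declared in the DDL.
try:
    conn.execute("INSERT INTO Courses VALUES (?, ?, ?)", ("Bad000", "Invalid", 0))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Host language: ordinary Python code consumes the query result.
rows = conn.execute("SELECT cid, credits FROM Courses").fetchall()
print(rejected, rows)
```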
26
Query Languages for Relational DBs
SQL (Structured Query Language)
SELECT name FROM Students WHERE age > 20

Algebra
π_{name}(σ_{age > 20}(Students))

Domain Calculus
{<X> | ∃V ∃Y ∃Z ∃T : Students(V, X, Y, Z, T) ∧ Z > 20}

Tuple Calculus
{X | ∃Y : Y ∈ Students ∧ Y.age > 20 ∧ X.name = Y.name}
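
To make the algebra expression concrete, selection and projection can be written as two small functions over a list of records. This is a minimal sketch: the function names select and project and the dictionary representation of tuples are choices made for this example, and the rows are the sample Students records used earlier in the course.

```python
students = [
    {"sid": 2833, "name": "Jones", "email": "jones@scs.ubbcluj.ro", "age": 19, "gr": 931},
    {"sid": 2877, "name": "Smith", "email": "smith@scs.ubbcluj.ro", "age": 20, "gr": 932},
    {"sid": 2976, "name": "Jones", "email": "jones@math.ubbcluj.ro", "age": 21, "gr": 933},
    {"sid": 2765, "name": "Mary", "email": "mary@math.ubbcluj.ro", "age": 22, "gr": 933},
]


def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        key = tuple(t[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


# Projection on name of the selection age > 20 over Students
names = project(select(students, lambda t: t["age"] > 20), ["name"])
print(names)
```

Relational algebra works on sets, so project removes duplicates; the SQL form keeps duplicate rows unless DISTINCT is written.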
27
Actors
System Analyst
entity-relationship diagram
Database Designer
Designs logical /physical schemas
Application Programmer
Database Administrator
Handles security and authorization
Data availability, crash recovery
Database tuning as needs evolve
System Administrator
End Users (Naive / Sophisticated end users)
28
Course Content
Database Management Systems: basic concepts
The Relational Database Model:
- Normal Forms
- Relational Database Query (Relational Algebra, SQL)
Database Design Models:
- Entity Relationship Model
- Object Oriented Model
- Object Relational Model
Physical Structure of Databases
Database Indexing
29
Laboratory


Working environments
DBMS: MS SQL Server
App. Development: .NET / C#
Data access: ADO.NET
30
Assessment / Other Details

Final grade
50% - laboratory activity / practical test
50% - written exam (session)

Details (bibliography, course slides,
seminars, lab descriptions, etc.)
http://www.cs.ubbcluj.ro/~tzutzu
31
TED Conferences
http://www.ted.com/


Hans Rosling - About Visualizing Data
(2006)
