Introduction to DBMS
DBMS Applications
View of Data
Data Abstraction
RDBMS concepts
Database languages
Keys in DBMS
Primary key
Super key
Candidate key
Alternate key
Composite key
Foreign key
Constraints in DBMS
Domain constraints
Mapping constraints
Cardinality in DBMS
Multivalued dependency
Transitive dependency
Normalization in dbms
ACID Properties
Deadlock
Concurrency Control
Telecom: A database keeps track of information about calls made, network usage,
customer details etc. Without database systems it would be hard to maintain such a
huge amount of data, which is updated every millisecond.
Banking System: For storing customer information, tracking day-to-day credit and debit
transactions, generating bank statements etc. All this work is done with the help
of database management systems.
Education sector: Database systems are frequently used in schools and colleges to
store and retrieve data about student details, staff details, course details, exam
details, payroll data, attendance details, fee details etc. There is a large amount of
inter-related data that needs to be stored and retrieved efficiently.
Online shopping: You must be aware of online shopping websites such as Amazon,
Flipkart etc. These sites store product information, your addresses and preferences, and
credit details, and provide you with a relevant list of products based on your query. All this
involves a database management system.
I have mentioned only a few applications; the list would never end if we started
mentioning all the DBMS applications.
Advantages of DBMS
Drawbacks of File system:
Data Isolation: Because data are scattered in various files, and files may be in
different formats, writing new application programs to retrieve the appropriate data is
difficult.
Disadvantages of DBMS:
Data abstraction
Instance and schema
Physical level: This is the lowest level of data abstraction. It describes how data is actually
stored in the database. The complex data structure details are found at this level.
Logical level: This is the middle level of the 3-level data abstraction architecture. It describes
what data is stored in the database.
View level: This is the highest level of data abstraction. It describes the user's interaction with
the database system.
Example: Let's say we are storing customer information in a customer table. At the physical
level these records can be described as blocks of storage (bytes, gigabytes, terabytes etc.)
in memory. These details are often hidden from the programmers.
At the logical level these records can be described as fields and attributes along with their
data types, and the relationships among them can be logically implemented.
Programmers generally work at this level because they are aware of such things about
database systems.
At the view level, users just interact with the system through a GUI and enter details on
the screen; they are not aware of how the data is stored or what data is stored, as such
details are hidden from them.
The design of a database at the view level is called the view schema. It generally describes end-user
interaction with database systems.
Relational Model
Hierarchical Model
Network Model: The network model is the same as the hierarchical model except that it has a
graph-like structure rather than a tree-based structure. Unlike the hierarchical model, this
model allows each record to have more than one parent record.
Physical Data Models: These models describe data at the lowest level of abstraction.
Multivalued Attributes: An attribute that can hold multiple values is known as a multivalued
attribute. We represent it with a double ellipse in an E-R Diagram. E.g. a person can have
more than one phone number, so the phone number attribute is multivalued.
Derived Attribute: A derived attribute is one whose value is dynamic and derived from
another attribute. It is represented by a dashed ellipse in an E-R Diagram. E.g. a person's age is
a derived attribute as it changes over time and can be derived from another attribute (date
of birth).
Stu_Name   Stu_Age
Ashish     23
Saurav     22
Lester     24
Lou        26

Course_Id   Course_Name
C01         Science
C02         DBMS
C22         Java
C39         Computer Networks
Here Stu_Id, Stu_Name & Stu_Age are attributes of table Student and Stu_Id, Course_Id &
Course_Name are attributes of table Course. The rows with values are the records
(commonly known as tuples).
RDBMS Concepts
RDBMS stands for relational database management system. A relational database has the following major
components: Table, Record / Tuple, Field, and Column / Attribute.
Table:
A table is a collection of data represented in rows and columns. For example, the
following table stores information about students.
Student_Id   Student_Name   Student_Addr       Student_Age
101          Chaitanya      Dayal Bagh, Agra   27
102          Ajeet          Delhi              26
103          Rahul          Gurgaon            24
104          Shubham        Chennai            25
Records / Tuple:
Each row of a table is known as a record, also called a tuple. For example, the row below is a record.
102          Ajeet          Delhi              26
Field:
The above table has four fields: Student_Id, Student_Name, Student_Addr & Student_Age.
Column / Attribute:
Each field together with its values is known as a column or attribute. For example, the set of values of the Student_Id field
is one of the four columns of the Student table.
Student_Id
101
102
103
104
student_age   student_name
12            Jon
13            Arya
12            Sansa

Stu_Name    Stu_Age
Steve       29
Chaitanya   27
Ajeet       28

Course Table:
Course_Id   Course_Name   Stu_Id
C01         Cobol         123
C21         Java          367
C22         Perl          367
C33         JQuery        234
DBMS languages
Database languages are used to read, update and store data in a database. There are
several such languages that can be used for this purpose; one of them is SQL (Structured
Query Language).
Types of DBMS languages:
Here DBName can be any string that would represent the database name.
Example: The statement below creates a database named Employee.
SQL> CREATE DATABASE Employee;
To get the list of all the databases, you can use the SHOW DATABASES statement.
Example
SQL> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| BeginnersBook      |
| AbcTemp            |
| Employee           |
| Customers          |
| Student            |
| Faculty            |
| MyTest             |
| Demo               |
+--------------------+
8 rows in set (0.00 sec)
As you can see, this statement lists all the databases. The list includes the Employee database
that we created above using the CREATE DATABASE statement.
The DROP DATABASE statement is used for deleting a database and all of its tables completely.
Syntax:
DROP DATABASE DBName;
Here DBName is the name of the database that you want to delete.
Example: The statement below deletes the database named Student.
SQL> DROP DATABASE Student;
Note: Deleting a database implicitly deletes all of its tables. For example, the above statement
deletes all the tables stored inside the Student database, along with the database itself.
After dropping a database you can check the database list to verify that the database has
been dropped successfully. This is how you can do it.
Before deleting Student Database:
SQL> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| Abc                |
| Xyz                |
| Student            |
| Demo               |
| Test               |
+--------------------+
After deleting Student Database:
SQL> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| Abc                |
| Xyz                |
| Demo               |
| Test               |
+--------------------+
Let's say we want to fetch column_a and column_x of a table named ABC. The query for this would be:
SELECT column_a, column_x FROM ABC;
Example:
Let's say we have an EMPLOYEES table with the data below.
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     | 23      | 9000.00    |
| 223 | Peter     | 24      | 2550.00    |
| 388 | Shubham   | 19      | 2444.00    |
| 499 | Chaitanya | 29      | 6588.00    |
| 589 | Apoorv    | 21      | 1400.00    |
| 689 | Rajat     | 24      | 8900.00    |
| 700 | Ajeet     | 20      | 18300.00   |
+-----+-----------+---------+------------+
In order to fetch the SSN and EMP_NAME, we can write the SELECT query like this:
SELECT SSN, EMP_NAME FROM EMPLOYEES;
+-----+-----------+
| SSN | EMP_NAME  |
+-----+-----------+
| 101 | Steve     |
| 223 | Peter     |
| 388 | Shubham   |
| 499 | Chaitanya |
| 589 | Apoorv    |
| 689 | Rajat     |
| 700 | Ajeet     |
+-----+-----------+
Similarly, you can fetch any particular column or group of columns using the SELECT query in SQL.
To fetch the entire EMPLOYEES table:
SELECT * FROM EMPLOYEES;
Result:
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     | 23      | 9000.00    |
| 223 | Peter     | 24      | 2550.00    |
| 388 | Shubham   | 19      | 2444.00    |
| 499 | Chaitanya | 29      | 6588.00    |
| 589 | Apoorv    | 21      | 1400.00    |
| 689 | Rajat     | 24      | 8900.00    |
| 700 | Ajeet     | 20      | 18300.00   |
+-----+-----------+---------+------------+
Syntax
UPDATE TableName
SET column_name1 = value, column_name2 = value....
WHERE condition;
The query updates only those rows that satisfy the condition defined in the WHERE clause.
Example
EMPLOYEES table:
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     | 23      | 9000.00    |
| 223 | Peter     | 24      | 2550.00    |
| 388 | Shubham   | 19      | 2444.00    |
| 499 | Chaitanya | 29      | 6588.00    |
| 589 | Apoorv    | 21      | 1400.00    |
| 689 | Rajat     | 24      | 8900.00    |
| 700 | Ajeet     | 20      | 18300.00   |
+-----+-----------+---------+------------+
Update the salary of employees to 10000 if their age is greater than 25.
SQL> UPDATE EMPLOYEES
SET EMP_SALARY = 10000
WHERE EMP_AGE > 25;
As you can see, only one employee in the table (Chaitanya) is above the age of 25. The salary of that employee
was updated to 10000; all other rows are unchanged.
Output:
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     | 23      | 9000.00    |
| 223 | Peter     | 24      | 2550.00    |
| 388 | Shubham   | 19      | 2444.00    |
| 499 | Chaitanya | 29      | 10000.00   |
| 589 | Apoorv    | 21      | 1400.00    |
| 689 | Rajat     | 24      | 8900.00    |
| 700 | Ajeet     | 20      | 18300.00   |
+-----+-----------+---------+------------+
Syntax
1) To delete a particular set of rows:
DELETE FROM TableName
WHERE condition;
2) To delete all the rows of a table:
DELETE FROM TableName;
Example:
Delete all the records that have SSN greater than 400:
SQL> DELETE FROM EMPLOYEES
WHERE SSN > 400;
After the successful execution of the above query the table would have the following records:
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     | 23      | 9000.00    |
| 223 | Peter     | 24      | 2550.00    |
| 388 | Shubham   | 19      | 2444.00    |
+-----+-----------+---------+------------+
From the remaining rows, delete the data of employees whose age is greater than or equal to 24:
SQL> DELETE FROM EMPLOYEES
WHERE EMP_AGE >= 24;
Result:
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     | 23      | 9000.00    |
| 388 | Shubham   | 19      | 2444.00    |
+-----+-----------+---------+------------+
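The UPDATE and DELETE walk-through above can be replayed end to end. The following is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for the MySQL prompt used in the text; the column types are assumptions, while the rows and statements are the ones shown above.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server used in the text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEES ("
    " SSN INTEGER PRIMARY KEY, EMP_NAME TEXT, EMP_AGE INTEGER, EMP_SALARY REAL)"
)
rows = [
    (101, "Steve", 23, 9000.00),
    (223, "Peter", 24, 2550.00),
    (388, "Shubham", 19, 2444.00),
    (499, "Chaitanya", 29, 6588.00),
    (589, "Apoorv", 21, 1400.00),
    (689, "Rajat", 24, 8900.00),
    (700, "Ajeet", 20, 18300.00),
]
conn.executemany("INSERT INTO EMPLOYEES VALUES (?, ?, ?, ?)", rows)

# UPDATE: only rows with EMP_AGE > 25 are changed.
updated = conn.execute(
    "UPDATE EMPLOYEES SET EMP_SALARY = 10000 WHERE EMP_AGE > 25"
).rowcount

# DELETE: first the rows with SSN > 400, then the rows with EMP_AGE >= 24.
conn.execute("DELETE FROM EMPLOYEES WHERE SSN > 400")
conn.execute("DELETE FROM EMPLOYEES WHERE EMP_AGE >= 24")

remaining = conn.execute(
    "SELECT SSN, EMP_NAME FROM EMPLOYEES ORDER BY SSN"
).fetchall()
print(updated, remaining)
```

Here `updated` counts the rows touched by the UPDATE, and `remaining` holds the rows left after both DELETE statements.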
Keys play an important role in a relational database; they are used for uniquely identifying rows
in a table. They also establish relationships among tables.
Primary Key: A primary key is a column or set of columns in a table that uniquely identifies tuples
(rows) in that table.
In the above Student table, the Stu_Id column uniquely identifies each row of the table.
Note:
The value of the primary key must be unique for each row of the table. A primary key column cannot
contain duplicate values.
A primary key need not be a single column; more than one column can together form the primary
key of a table. For example, {Stu_Id, Stu_Name} collectively can play the role of primary key in the above table,
but that does not make sense because Stu_Id alone is enough to uniquely identify rows, so
there is no reason to make things complex. That said, we should choose more than one column as the primary key
only when there is no single column that can play the role of primary key.
Super Key: A super key is a set of one or more columns (attributes) that uniquely identifies
rows in a table.
Emp_Number   Emp_Name
226          Steve
227          Ajeet
228          Chaitanya
229          Robert
Super keys:
{Emp_SSN}
{Emp_Number}
{Emp_SSN, Emp_Number}
{Emp_SSN, Emp_Name}
{Emp_SSN, Emp_Number, Emp_Name}
{Emp_Number, Emp_Name}
All of the above sets are able to uniquely identify rows of the employee table.
Candidate Keys:
As I stated above, they are the minimal super keys with no redundant attributes.
{Emp_SSN}
{Emp_Number}
Only these two sets are candidate keys, as all other sets have redundant attributes that are not
necessary for unique identification.
Primary key:
The primary key is selected from the set of candidate keys by the database designer. So
either {Emp_SSN} or {Emp_Number} can be the primary key.
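The relationship between super keys and candidate keys can be checked mechanically on sample data. The sketch below is only an illustration: the Emp_SSN values and the fifth row are invented, and a check on sample rows can only show that a set of columns is not a key, since real keys are a property of the schema rather than of one snapshot.

```python
from itertools import combinations

# Hypothetical Employee rows; the Emp_SSN values and the last row are invented.
columns = ("Emp_SSN", "Emp_Number", "Emp_Name")
rows = [
    ("111-11-1111", 226, "Steve"),
    ("222-22-2222", 227, "Ajeet"),
    ("333-33-3333", 228, "Chaitanya"),
    ("444-44-4444", 229, "Robert"),
    ("555-55-5555", 230, "Steve"),  # repeated name, so Emp_Name alone is not a key
]

def is_super_key(attrs):
    """True if no two rows agree on every attribute in attrs."""
    idx = [columns.index(a) for a in attrs]
    projected = [tuple(r[i] for i in idx) for r in rows]
    return len(set(projected)) == len(rows)

def super_keys():
    """Every non-empty set of columns that identifies the sample rows uniquely."""
    return [
        set(c)
        for n in range(1, len(columns) + 1)
        for c in combinations(columns, n)
        if is_super_key(c)
    ]

def candidate_keys():
    """Super keys with no proper subset that is also a super key."""
    sks = super_keys()
    return [k for k in sks if not any(other < k for other in sks)]

print(super_keys())
print(candidate_keys())
```

On this sample the function finds the same six super keys listed above, and only {Emp_SSN} and {Emp_Number} survive the minimality test.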
Candidate Key: A super key with no redundant attribute is known as a candidate key.
Emp_Number   Emp_Name
2264         Steve
2278         Ajeet
2288         Chaitanya
2290         Robert
Alternate Key: Out of all candidate keys, only one is selected as the primary key;
the remaining keys are known as alternate or secondary keys.
Emp_Id
Composite Key: A key that consists of more than one attribute to uniquely identify rows
(also known as records or tuples) in a table is called a composite key.
order_Id   product_code   product_count
O001       P007           23
O123       P007           19
O123       P230           82
O001       P890           42
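In the table above neither order_Id nor product_code is unique on its own, but the pair is. A minimal sketch of declaring that pair as a composite primary key follows; it assumes SQLite through Python's sqlite3 module, and the table name `sales` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table name "sales" is hypothetical; the columns follow the table above.
conn.execute(
    "CREATE TABLE sales ("
    " order_Id TEXT, product_code TEXT, product_count INTEGER,"
    " PRIMARY KEY (order_Id, product_code))"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("O001", "P007", 23), ("O123", "P007", 19),
     ("O123", "P230", 82), ("O001", "P890", 42)],
)

def try_insert(order_id, product_code, count):
    """Return True if the row was accepted, False if the key rejected it."""
    try:
        conn.execute("INSERT INTO sales VALUES (?, ?, ?)",
                     (order_id, product_code, count))
        return True
    except sqlite3.IntegrityError:
        return False

repeated_pair = try_insert("O001", "P007", 5)  # same pair as the first row
new_pair = try_insert("O001", "P230", 5)       # both values exist, but not together
print(repeated_pair, new_pair)
```

The database rejects the repeated pair and accepts the new combination, which is exactly the uniqueness a composite key is meant to provide.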
Foreign Key: Foreign keys are the columns of a table that point to the primary key of
another table. They act as a cross-reference between tables.
In the example below, the Stu_Id column in the Course_enrollment table is a foreign key that points to the Stu_Id column of the Student table.
Course_enrollment table:
Course_Id   Stu_Id
C01         101
C02         102
C03         101
C05         102
C06         103
C07         102
Student table:
Stu_Id   Stu_Name    Stu_Age
101      Chaitanya   22
102      Arya        26
103      Bran        25
104      Jon         21
Note: Practically, a foreign key does not depend on the primary key tag of the other table. If it
points to a unique column (not necessarily a primary key) of another table, it is still a
foreign key. So a more accurate definition would be: foreign keys are the columns of a
table that point to a candidate key of another table.
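The cross-reference can be enforced by the database itself. The sketch below uses SQLite through Python's sqlite3 module with the two tables above; making Course_Id the primary key of Course_enrollment is an assumption, and SQLite only checks foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE Student ("
    " Stu_Id INTEGER PRIMARY KEY, Stu_Name TEXT, Stu_Age INTEGER)"
)
# Assumption: Course_Id is the primary key of the enrollment table.
conn.execute(
    "CREATE TABLE Course_enrollment ("
    " Course_Id TEXT PRIMARY KEY,"
    " Stu_Id INTEGER REFERENCES Student(Stu_Id))"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(101, "Chaitanya", 22), (102, "Arya", 26), (103, "Bran", 25), (104, "Jon", 21)],
)
conn.executemany(
    "INSERT INTO Course_enrollment VALUES (?, ?)",
    [("C01", 101), ("C02", 102), ("C03", 101),
     ("C05", 102), ("C06", 103), ("C07", 102)],
)

def enroll(course_id, stu_id):
    """Return True if the enrollment row was accepted by the database."""
    try:
        conn.execute("INSERT INTO Course_enrollment VALUES (?, ?)",
                     (course_id, stu_id))
        return True
    except sqlite3.IntegrityError:
        return False

known_student = enroll("C08", 104)    # 104 exists in Student
unknown_student = enroll("C09", 999)  # 999 does not exist in Student
print(known_student, unknown_student)
```

An enrollment for an existing student is accepted, while one that points at a missing Stu_Id is rejected.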
Constraints in DBMS
Constraints enforce limits on the data or type of data that can be inserted into, updated in or deleted
from a table. The whole purpose of constraints is to maintain data integrity during an
insert, update or delete on a table.
Types of constraints
NOT NULL
UNIQUE
DEFAULT
CHECK
Key Constraints PRIMARY KEY, FOREIGN KEY
Domain constraints
Mapping constraints
NOT NULL:
The NOT NULL constraint makes sure that a column does not hold a NULL value. When we don't
provide a value for a particular column while inserting a record into a table, it takes the NULL
value by default. By specifying the NOT NULL constraint, we can be sure that a particular column
cannot have NULL values.
Example:
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (235),
PRIMARY KEY (ROLL_NO)
);
With this definition, the ROLL_NO, STU_NAME and STU_AGE columns will not accept NULL values; STU_ADDRESS, which has no NOT NULL constraint, still can.
UNIQUE:
The UNIQUE constraint forces a column or set of columns to have unique values. If a column
has a unique constraint, that column cannot have duplicate values in the
table.
Example:
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35) UNIQUE,
PRIMARY KEY (ROLL_NO)
);
Here we are setting up the UNIQUE constraint for two columns, STU_NAME & STU_ADDRESS, which means
these two columns cannot have duplicate values.
Note: The STU_NAME column has two constraints set up (both NOT NULL and UNIQUE).
MySQL:
Syntax:
CREATE TABLE <table_name>
(
<column_name> <data_type>,
<column_name2> <data_type>,
....
....
UNIQUE(column_name)
);
Example:
Setting up constraint on STU_NAME column.
CREATE TABLE STUDENTS(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35),
UNIQUE(STU_NAME),
PRIMARY KEY (ROLL_NO)
);
Adding a UNIQUE constraint to an existing table:
Example:
ALTER TABLE STUDENTS
ADD UNIQUE (STU_NAME);
Naming of UNIQUE Constraint:
Syntax:
ALTER TABLE <table_name>
ADD CONSTRAINT <constraint_name> UNIQUE (<column_name1>, <column_name2>,...);
Example:
ALTER TABLE STUDENTS
ADD CONSTRAINT stu_Info UNIQUE (STU_NAME,STU_ADDRESS);
Dropping a UNIQUE constraint:
Example (MySQL):
ALTER TABLE STUDENTS
DROP INDEX stu_Info;
Example (SQL Server / Oracle):
ALTER TABLE STUDENTS
DROP CONSTRAINT stu_Info;
DEFAULT:
The DEFAULT constraint provides a default value to a column when there is no value
provided while inserting a record into a table.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
STU_ADDRESS VARCHAR (35) ,
PRIMARY KEY (ROLL_NO)
);
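To set a default value on a column of an existing table:
Example:
ALTER TABLE STUDENTS
MODIFY EXAM_FEE INT DEFAULT 10000;
To drop a DEFAULT constraint:
Syntax:
ALTER TABLE <table_name>
ALTER COLUMN <column_name> DROP DEFAULT;
Example:
Let's say we want to drop the DEFAULT constraint on the EXAM_FEE column of the STUDENTS table, which we created in the above
sections. We can do it like this.
ALTER TABLE STUDENTS
ALTER COLUMN EXAM_FEE DROP DEFAULT;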
CHECK:
This constraint is used for specifying the range of values allowed in a particular column of a table.
When this constraint is set on a column, it ensures that every value in that column
falls within the specified range.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL CHECK(ROLL_NO >1000) ,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
STU_ADDRESS VARCHAR (35) ,
PRIMARY KEY (ROLL_NO)
);
In the above example we have set the check constraint on the ROLL_NO column of the STUDENT
table. Now the ROLL_NO field must hold a value greater than 1000.
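The constraints described so far (NOT NULL, UNIQUE, DEFAULT and CHECK) can be exercised together. This is a minimal sketch that assumes SQLite through Python's sqlite3 module rather than the MySQL setup of the text; the inserted names and ages are made-up sample values, and other database systems word their error messages differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT ("
    " ROLL_NO INT NOT NULL CHECK (ROLL_NO > 1000),"
    " STU_NAME VARCHAR(35) NOT NULL UNIQUE,"
    " STU_AGE INT NOT NULL,"
    " EXAM_FEE INT DEFAULT 10000,"
    " STU_ADDRESS VARCHAR(35),"
    " PRIMARY KEY (ROLL_NO))"
)

def insert(roll_no, name, age):
    """Return None on success, or the database's error message on rejection."""
    try:
        conn.execute(
            "INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (?, ?, ?)",
            (roll_no, name, age),
        )
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

ok = insert(1001, "Ajeet", 26)            # satisfies every constraint
check_error = insert(999, "Rahul", 24)    # violates CHECK (ROLL_NO > 1000)
null_error = insert(1002, None, 24)       # violates NOT NULL on STU_NAME
unique_error = insert(1003, "Ajeet", 25)  # violates UNIQUE on STU_NAME

# EXAM_FEE was not supplied for roll number 1001, so the DEFAULT applies.
default_fee = conn.execute(
    "SELECT EXAM_FEE FROM STUDENT WHERE ROLL_NO = 1001"
).fetchone()[0]
print(ok, check_error, null_error, unique_error, default_fee)
```

Only the first insert is accepted; each of the other three is rejected by a different constraint, and the accepted row picks up the default exam fee.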
Key constraints:
PRIMARY KEY:
Primary key uniquely identifies each record in a table. It must have unique values and
cannot contain nulls. In the below example the ROLL_NO field is marked as primary key,
that means the ROLL_NO field cannot have duplicate and null values.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
Definition: A primary key is a column or set of columns in a table that uniquely identifies tuples (rows) in that
table.
Example:
Student Table
Stu_Id   Stu_Name   Stu_Age
101      Steve      23
102      John       24
103      Robert     28
104      Carl       22
In the above Student table, the Stu_Id column uniquely identifies each row of the table.
How to choose a primary key?
There are two ways: either create a column and let the database automatically fill it with numbers in increasing
order for each row, or choose a column yourself, making sure that it does not contain duplicates or nulls.
For example, in the above Student table, the Stu_Name column cannot be a primary key because more than one person
can have the same name; similarly, the Stu_Age column cannot play the primary key role because more than one person
can have the same age.
FOREIGN KEY:
Foreign keys are the columns of a table that point to the primary key of another table.
They act as a cross-reference between tables.
Course_enrollment table:
Course_Id   Stu_Id
C01         101
C02         102
C03         101
C05         102
C06         103
C07         102
Student table:
Stu_Id   Stu_Name    Stu_Age
101      Chaitanya   22
102      Arya        26
103      Bran        25
104      Jon         21
Domain constraints:
Each table has a certain set of columns, and each column allows only one type of data, based
on its data type. The column does not accept values of any other data type.
Domain constraints are user-defined data types and we can define them like this:
Domain Constraint = data type + Constraints (NOT NULL / UNIQUE / PRIMARY KEY / FOREIGN
KEY / CHECK / DEFAULT)
Another example:
I want to create a table bank_account with an account_type field whose value is either Checking or
Saving:
create domain account_type char(12)
constraint acc_type_test
check(value in ('Checking', 'Saving'));
Assuming that a customer orders more than once, the above relation represents a one-to-many
relationship. Similarly, we can achieve other mapping constraints based on the requirements.
Cardinality in DBMS
In DBMS you may come across the term cardinality in two different places, and it has two different
meanings.
In Context of Data Models:
In terms of data modeling, cardinality refers to the relationship between two tables. It
can be of four types:
One to One: A single row of table 1 associates with a single row of table 2
One to Many: A single row of table 1 associates with more than one row of table 2
Many to One: Many rows of table 1 associate with a single row of table 2
Many to Many: Many rows of table 1 associate with many rows of table 2
In Context of Query Optimization:
In terms of queries, cardinality refers to the uniqueness of the values in a column of a table. A
column with all unique values has high cardinality and a column with all
duplicate values has low cardinality. These cardinality scores help in query
optimization.
The attributes of a table are said to be dependent on each other when an attribute of a table
uniquely identifies another attribute of the same table.
For example: Suppose we have a student table with attributes Stu_Id, Stu_Name, Stu_Age.
Here the Stu_Id attribute uniquely identifies the Stu_Name attribute of the student table because if
we know the student id we can tell the student name associated with it. This is known as
functional dependency and can be written as Stu_Id->Stu_Name, or in words we can say
Stu_Name is functionally dependent on Stu_Id.
Formally:
If column A of a table uniquely identifies column B of the same table then it can be
represented as A->B (attribute B is functionally dependent on attribute A).
Also, Student_Id -> Student_Id & Student_Name -> Student_Name are trivial dependencies
too.
If a functional dependency X->Y holds true where Y is not a subset of X then this
dependency is called a non-trivial functional dependency.
For example:
An employee table with three attributes: emp_id, emp_name, emp_address.
The following functional dependencies are non-trivial:
emp_id -> emp_name (emp_name is not a subset of emp_id)
emp_id -> emp_address (emp_address is not a subset of emp_id)
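A functional dependency can be tested against sample rows: X->Y fails as soon as two rows agree on X but differ on Y. The following is a small sketch in which the helper names `holds` and `is_trivial` and the sample rows are hypothetical. Note that passing the test on a sample does not prove the dependency for the schema; a sample can only refute it.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same lhs value maps to two different rhs values
    return True

def is_trivial(lhs, rhs):
    """A dependency is trivial when rhs is a subset of lhs."""
    return set(rhs) <= set(lhs)

# Hypothetical sample rows for the employee table described above.
employees = [
    {"emp_id": 1, "emp_name": "Rick", "emp_address": "Delhi"},
    {"emp_id": 2, "emp_name": "Rick", "emp_address": "Agra"},
    {"emp_id": 3, "emp_name": "Glenn", "emp_address": "Delhi"},
]

print(holds(employees, ("emp_id",), ("emp_name",)))       # emp_id -> emp_name
print(holds(employees, ("emp_name",), ("emp_address",)))  # emp_name -> emp_address
print(is_trivial(("emp_id", "emp_name"), ("emp_name",)))
```

In the sample, emp_id determines emp_name, while emp_name does not determine emp_address because two employees named Rick live in different cities.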
Multivalued dependency occurs when there is more than one independent multivalued
attribute in a table.
For example: Consider a bike manufacturing company, which produces two colors (Black and
Red) of each model every year.
bike_model   manuf_year   color
M1001        2007         Black
M1001        2007         Red
M2012        2008         Black
M2012        2008         Red
M2222        2009         Black
M2222        2009         Red
Here the columns manuf_year and color are independent of each other and dependent on
bike_model. In this case these two columns are said to be multivalued dependent on
bike_model. These dependencies can be represented like this:
bike_model ->> manuf_year
bike_model ->> color
X->Y
Y->Z
Note: A transitive dependency can only occur in a relation of three or more attributes. This
dependency helps us normalize the database into 3NF (3rd Normal Form).
Example: Let's take an example to understand it better:
Book                 Author                Author_age
Game of Thrones      George R. R. Martin   66
Harry Potter         J. K. Rowling         49
Dying of the Light   George R. R. Martin   66
{Book} -> {Author} (if we know the book, we know the author's name)
{Author} does not -> {Book} (one author can have written several books)
{Author} -> {Author_age}
Therefore, as per the rule of transitive dependency, {Book} -> {Author_age} should hold.
That makes sense because if we know the book name we can find the author's age.
Anomalies in DBMS
There are three types of anomalies that occur when the database is not normalized. These
are the insertion, update and deletion anomalies. Let's take an example to understand this.
Example: Suppose a manufacturing company stores employee details in a table named
employee that has four attributes: emp_id for storing the employee's id, emp_name for storing
the employee's name, emp_address for storing the employee's address and emp_dept for storing
the department in which the employee works. At some point in time the table looks
like this:
emp_id   emp_name   emp_address   emp_dept
101      Rick       Delhi         D001
101      Rick       Delhi         D002
123      Maggie     Agra          D890
166      Glenn      Chennai       D900
166      Glenn      Chennai       D004
The above table is not normalized. We will see the problems that we face when a table is
not normalized.
Update anomaly: In the above table we have two rows for employee Rick as he belongs to
two departments of the company. If we want to update the address of Rick then we have to
update it in both rows or the data will become inconsistent. If the correct
address gets updated in one department's row but not in the other, then as per the database Rick
would have two different addresses, which is not correct and leads to
inconsistent data.
Insert anomaly: Suppose a new employee joins the company who is under training and
currently not assigned to any department. We would not be able to insert that employee into
the table if the emp_dept field doesn't allow nulls.
Delete anomaly: Suppose at some point the company closes the department D890.
Deleting the rows that have emp_dept as D890 would also delete the information
of employee Maggie, since she is assigned only to this department.
To overcome these anomalies we need to normalize the data. In the next section we will
discuss normalization.
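The update and delete anomalies can be reproduced on the employee table above. This is a rough sketch using SQLite through Python's sqlite3 module; the new address 'Mumbai' is an invented value used only to trigger the inconsistency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Unnormalized table from the text: one row per (employee, department) pair.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INT, emp_name TEXT, emp_address TEXT, emp_dept TEXT)"
)
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (101, "Rick", "Delhi", "D001"), (101, "Rick", "Delhi", "D002"),
    (123, "Maggie", "Agra", "D890"),
    (166, "Glenn", "Chennai", "D900"), (166, "Glenn", "Chennai", "D004"),
])

# Update anomaly: the address is changed in only one of Rick's two rows.
conn.execute(
    "UPDATE employee SET emp_address = 'Mumbai'"
    " WHERE emp_id = 101 AND emp_dept = 'D001'"
)
rick_addresses = conn.execute(
    "SELECT COUNT(DISTINCT emp_address) FROM employee WHERE emp_id = 101"
).fetchone()[0]

# Delete anomaly: closing department D890 also removes every trace of Maggie.
conn.execute("DELETE FROM employee WHERE emp_dept = 'D890'")
maggie_rows = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE emp_name = 'Maggie'"
).fetchone()[0]
print(rick_addresses, maggie_rows)
```

After the partial update Rick has two different addresses on record, and after the department is closed no row for Maggie remains.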
Normalization
Here
emp_id   emp_name   emp_address   emp_mobile
101      Herschel   New Delhi     8912312390
102      Jon        Kanpur        8812121212, 9900012222
103      Ron        Chennai       7778881212
104      Lester     Bangalore     9990000123, 8123450987
Two employees (Jon & Lester) have two mobile numbers each, so the company stored them
in the same field, as you can see in the table above.
This table is not in 1NF, as the rule says each attribute of a table must have atomic
(single) values; the emp_mobile values for employees Jon & Lester violate that rule.
To make the table comply with 1NF we should have the data like this:
emp_id   emp_name   emp_address   emp_mobile
101      Herschel   New Delhi     8912312390
102      Jon        Kanpur        8812121212
102      Jon        Kanpur        9900012222
103      Ron        Chennai       7778881212
104      Lester     Bangalore     9990000123
104      Lester     Bangalore     8123450987
subject     teacher_age
Maths       38
Physics     38
Biology     38
Physics     40
Chemistry   40

teacher_age
38
38
40

teacher_subject table:
teacher_id   subject
111          Maths
111          Physics
222          Biology
333          Physics
333          Chemistry
emp_name   emp_zip
John       282005
Ajeet      222008
Lora       282007
Lilly      292008
Steve      222999

employee_zip table:
emp_zip   emp_state   emp_city   emp_district
282005    UP          Agra       Dayal Bagh
222008    TN          Chennai    M-City
282007    TN          Chennai    Urrapakkam
292008    UK          Pauri      Bhagwan
222999    MP          Gwalior    Ratan
emp_nationality   emp_dept                       dept_type   dept_no_of_emp
Austrian          Production and planning        D001        200
Austrian          stores                         D001        250
American          design and technical support   D134        100
American          Purchasing department          D134        600

emp_nationality
Austrian
American

emp_dept table:
emp_dept                       dept_type   dept_no_of_emp
Production and planning        D001        200
stores                         D001        250
design and technical support   D134        100
Purchasing department          D134        600

emp_dept_mapping table:
emp_id   emp_dept
1001     Production and planning
1001     stores
1002     design and technical support
1002     Purchasing department
Functional dependencies:
emp_id -> emp_nationality
emp_dept -> {dept_type, dept_no_of_emp}
Candidate keys:
For first table: emp_id
For second table: emp_dept
For third table: {emp_id, emp_dept}
This is now in BCNF because in both functional dependencies the left-hand side is a key.
Atomicity: This property ensures that either all the operations of a transaction are
reflected in the database or none are. Let's take the example of a banking system to understand this:
Suppose Account A has a balance of $400 and B has $700. Account A is transferring $100 to
Account B. This is a transaction that has two operations: a) debiting $100 from A's
balance, b) crediting $100 to B's balance. Let's say the first operation succeeded
while the second failed; in this case A's balance would be $300 while B would have $700
instead of $800. This is unacceptable in a banking system. Either the transaction should
fail without executing any of its operations or it should process both operations. The
Atomicity property ensures that.
Isolation: For every pair of transactions, the result should be the same as if one transaction started execution only
after the other had finished. I have already discussed the example of Isolation in
the Consistency property above.
Durability: Once a transaction completes successfully, the changes it has made to
the database should be permanent even if there is a system failure. The recovery-management component of database systems ensures the durability of transactions.
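The transfer example from the Atomicity paragraph can be tried out directly. The sketch below assumes SQLite through Python's sqlite3 module; the `account` table, the `transfer` helper and the `fail_after_debit` switch are hypothetical names introduced only to simulate a failure between the debit and the credit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INT CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 400), ("B", 700)])
conn.commit()

def transfer(src, dst, amount, fail_after_debit=False):
    """Move amount from src to dst as one transaction; roll back on any error."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src))
            if fail_after_debit:
                raise RuntimeError("simulated failure between debit and credit")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst))
        return True
    except (RuntimeError, sqlite3.Error):
        return False

def balances():
    """Current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))

failed = transfer("A", "B", 100, fail_after_debit=True)
after_failure = balances()
succeeded = transfer("A", "B", 100)
after_success = balances()
print(failed, after_failure, succeeded, after_success)
```

When the failure is simulated, the debit is rolled back and both balances are unchanged; when the transfer completes, both operations take effect together.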
Deadlock in DBMS
A deadlock is a condition wherein two or more tasks are waiting for each other in order to
finish, but none of the tasks is willing to give up the resources that the other tasks need. In this
situation no task ever finishes and all remain in a waiting state forever.
Coffman conditions
Coffman stated four conditions for a deadlock to occur. A deadlock may occur if all of the following
conditions hold true.
Mutual exclusion condition: There must be at least one resource that cannot be used by
more than one process at a time.
Hold and wait condition: A process that is holding a resource can request additional
resources that are being held by other processes in the system.
No preemption condition: A resource cannot be forcibly taken from a process. Only the
process itself can release a resource that it holds.
Circular wait condition: A condition where one process is waiting for a resource that is
being held by a second process, the second process is waiting for a third process, and so on, and the
last process is waiting for the first process, thus forming a circular chain of waiting.
Deadlock Handling
Ignore the deadlock (Ostrich algorithm)
Did that make you laugh? You may be wondering how ignoring a deadlock can come under deadlock
handling. But the Windows operating system you are using on your PC uses this approach to
deadlock handling, and that is one reason it sometimes hangs and you have to reboot it to get it
working. Not only Windows but UNIX also uses this approach.
The question is why? Why do they ignore a deadlock instead of dealing with it, and why is this
called the Ostrich algorithm?
Well! Let me answer the second question first. This is known as the Ostrich algorithm because in this
approach we ignore the deadlock and pretend that it will never occur, just like the proverbial ostrich
that sticks its head in the sand and pretends there is no problem.
Let's discuss why we ignore it: when deadlocks are believed to be very rare and the cost of
deadlock handling is high, ignoring them is a better solution than handling them. For example,
take the operating system case: if the time required to handle a deadlock is higher than
the time required to reboot Windows, then rebooting is the preferred choice, considering
that deadlocks are very rare in Windows.
Deadlock detection
The resource scheduler keeps track of the resources allocated to and requested by
processes. Thus, if there is a deadlock, it is known to the resource scheduler. This is how a deadlock
is detected.
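One common way to turn that bookkeeping into a check is a wait-for graph: draw an edge from each process to the processes holding a resource it has requested, and look for a cycle. The text does not name this technique, so the sketch below is an assumption about how a scheduler might do it, and the process names are hypothetical.

```python
def find_deadlock(wait_for):
    """Return a list of processes forming a cycle in the wait-for graph, or None.

    wait_for maps each process to the processes whose resources it is waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color = {p: WHITE for p in wait_for}
    path = []

    def visit(p):
        color[p] = GREY
        path.append(p)
        for q in wait_for.get(p, ()):
            state = color.get(q, WHITE)
            if state == GREY:
                return path[path.index(q):]  # back edge closes a cycle
            if state == WHITE:
                cycle = visit(q)
                if cycle:
                    return cycle
        color[p] = BLACK
        path.pop()
        return None

    for p in list(wait_for):
        if color[p] == WHITE:
            cycle = visit(p)
            if cycle:
                return cycle
    return None

# P1 waits for P2, P2 waits for P3, and P3 waits for P1: a circular wait.
print(find_deadlock({"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}))
# No cycle: P3 is not waiting for anyone.
print(find_deadlock({"P1": ["P2"], "P2": ["P3"], "P3": []}))
```

A cycle in this graph corresponds to the circular wait condition described earlier, so reporting the cycle tells the scheduler which processes are involved.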
Deadlock prevention
We have learnt that a deadlock can occur only if all four Coffman conditions hold, so
preventing one or more of them prevents the deadlock.
Removing mutual exclusion: All resources must be sharable, meaning that more than
one process can hold a resource at the same time. This approach is practically impossible.
Removing the hold and wait condition: This condition can be removed if a process acquires all the
resources it needs before it starts. Another way to remove it is to enforce the rule that a
process may request resources only when it holds none.
Preemption of resources: Taking resources away from a process removes the no-preemption
condition, but it can result in rollback, so this needs to be avoided in order to maintain the
consistency and stability of the system.
Avoiding the circular wait condition: This can be avoided if the resources are maintained in a
hierarchy and a process may acquire resources only in increasing order of precedence. Another
way is to enforce a one-resource-per-process rule: a process can request a resource only after
it releases the resource it currently holds. This also avoids circular wait.
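The "increasing order of precedence" rule for avoiding circular wait can be sketched with ordinary thread locks. This is a minimal illustration, not a DBMS lock manager: RankedLock, acquire_in_order, release_all and transfer are hypothetical names, and the rank numbers stand in for the resource hierarchy.

```python
import threading


class RankedLock:
    """A lock with a fixed position (rank) in the resource hierarchy."""

    def __init__(self, rank):
        self.rank = rank
        self.lock = threading.Lock()


def acquire_in_order(*locks):
    """Acquire the given locks in increasing rank, whatever order they were passed in."""
    ordered = sorted(locks, key=lambda ranked: ranked.rank)
    for ranked in ordered:
        ranked.lock.acquire()
    return ordered


def release_all(ordered):
    """Release locks in the reverse of the order they were acquired."""
    for ranked in reversed(ordered):
        ranked.lock.release()


def transfer(first, second, log, name):
    """Hypothetical piece of work that needs both resources at once."""
    held = acquire_in_order(first, second)
    try:
        log.append(name)
    finally:
        release_all(held)
```

Two threads that call transfer(a, b, ...) and transfer(b, a, ...) both end up locking the lower-ranked resource first, so neither can hold one lock while waiting for a lock the other already holds.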
Deadlock Avoidance
Deadlock can be avoided if resources are allocated in such a way that a deadlock cannot
occur. There are two algorithms for deadlock avoidance:
Wait/Die
Wound/Wait
Here is the table representation of resource allocation for each algorithm. Both of these
algorithms take process age into consideration when deciding how to allocate resources for
deadlock avoidance.
Situation | Wait/Die | Wound/Wait
Older process needs a resource held by a younger process | Older process waits | Younger process dies
Younger process needs a resource held by an older process | Younger process dies | Younger process waits
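The two rules can be written as a pair of small decision functions. This is a sketch under the usual assumption that each process carries a timestamp from when it started, so a smaller timestamp means an older process; the function names and the returned strings are hypothetical.

```python
def wait_die(requester_ts, holder_ts):
    """Wait/Die: an older requester waits, a younger requester dies (is rolled back)."""
    if requester_ts < holder_ts:  # requester is older than the holder
        return "requester waits"
    return "requester dies"


def wound_wait(requester_ts, holder_ts):
    """Wound/Wait: an older requester wounds the holder, a younger requester waits."""
    if requester_ts < holder_ts:  # requester is older than the holder
        return "holder dies"
    return "requester waits"
```

In both schemes the process that dies is always the younger one, so the oldest process is never rolled back.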
Numeric data types
bigint, float, int, real, smallint, tinyint, bit, decimal, numeric, money, smallmoney
Date and time data types
datetime: from Jan 1, 1753
smalldatetime: from Jan 1, 1900 to Jun 6, 2079
date
time
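The odd-looking upper limit of smalldatetime, Jun 6, 2079, makes sense if the date part is stored as an unsigned 16-bit count of days after Jan 1, 1900. That storage layout is an assumption stated here for illustration, not something the table above says; the arithmetic below simply checks that it reproduces the listed limit.

```python
from datetime import date, timedelta

# Assumption: the date part is an unsigned 16-bit day count from the base date.
SMALLDATETIME_MIN = date(1900, 1, 1)
max_day_offset = 2**16 - 1  # largest value a 16-bit unsigned counter can hold

smalldatetime_max = SMALLDATETIME_MIN + timedelta(days=max_day_offset)
print(smalldatetime_max)  # 2079-06-06
```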
Character string data types
Data type | Description
char | Fixed-length non-Unicode characters, with a maximum length of 8,000 characters.
varchar | Variable-length non-Unicode data, with a maximum of 8,000 characters.
varchar(max) | Variable-length non-Unicode data, with a maximum length of 2^31 characters (SQL Server 2005 and later).
text | Variable-length non-Unicode data, with a maximum length of 2,147,483,647 characters.
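The difference between fixed-length and variable-length character types is easiest to see with a toy model. The sketch below is plain Python, not SQL Server itself: store_char and store_varchar are hypothetical helpers that mimic the general idea of a char(n) value being padded with spaces to n characters while a varchar(n) value keeps its own length.

```python
def store_char(value, n):
    """Toy model of char(n): the stored value always occupies exactly n characters."""
    if len(value) > n:
        raise ValueError("value is longer than the declared length")
    return value.ljust(n)  # pad on the right with spaces


def store_varchar(value, n):
    """Toy model of varchar(n): the stored value keeps its own length, up to n."""
    if len(value) > n:
        raise ValueError("value is longer than the declared length")
    return value


print(len(store_char("abc", 10)))     # 10
print(len(store_varchar("abc", 10)))  # 3
```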
Unicode character string data types
nchar, nvarchar, nvarchar(max), ntext
nvarchar(max): Variable-length Unicode data, with a maximum length of 2^31 characters (SQL Server 2005 and later).
Binary data types
binary, varbinary, varbinary(max), image
varbinary(max): Variable-length binary data, with a maximum length of 2^31 bytes (SQL Server 2005 and later).
Comparison operators
Operator | Description
= | Checks whether the values of the two operands are equal; if they are, the condition is true.
!= | Checks whether the values of the two operands are equal; if they are not, the condition is true.
<> | Checks whether the values of the two operands are equal; if they are not, the condition is true.
> | Checks whether the value of the left operand is greater than the value of the right operand; if it is, the condition is true.
< | Checks whether the value of the left operand is less than the value of the right operand; if it is, the condition is true.
>= | Checks whether the value of the left operand is greater than or equal to the value of the right operand; if it is, the condition is true.
<= | Checks whether the value of the left operand is less than or equal to the value of the right operand; if it is, the condition is true.
!< | Checks whether the value of the left operand is not less than the value of the right operand; if it is not, the condition is true.
!> | Checks whether the value of the left operand is not greater than the value of the right operand; if it is not, the condition is true.
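To try the comparison operators out, here is a small sketch that runs them against an in-memory SQLite database through Python's sqlite3 module. The employee table, its rows and the names_where helper are made up for the example. SQLite is used only because it ships with Python; as far as I know it does not accept the !< and !> forms, so their equivalents >= and <= are used instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Asha", 3000), ("Ben", 4500), ("Chen", 6000)],
)


def names_where(condition):
    """Return employee names matching a hard-coded WHERE condition, sorted by name."""
    sql = f"SELECT name FROM employee WHERE {condition} ORDER BY name"
    return [row[0] for row in conn.execute(sql)]


for condition in (
    "salary = 4500",
    "salary != 4500",
    "salary <> 4500",
    "salary > 4500",
    "salary < 4500",
    "salary >= 4500",  # same meaning as salary !< 4500
    "salary <= 4500",  # same meaning as salary !> 4500
):
    print(condition, "->", names_where(condition))
```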
Logical operators
Operator | Description
ALL | The ALL operator is used to compare a value to all values in another value set.
AND | The AND operator allows multiple conditions to be combined in an SQL statement's WHERE clause; all of them must be true.
ANY | The ANY operator is used to compare a value to any applicable value in the list according to the condition.
BETWEEN | The BETWEEN operator is used to search for values that lie within a range, given the minimum value and the maximum value.
EXISTS | The EXISTS operator is used to search for the presence of a row in a specified table that meets certain criteria.
IN | The IN operator is used to compare a value to a list of literal values that have been specified.
LIKE | The LIKE operator is used to compare a value to similar values using wildcard operators.
NOT | The NOT operator reverses the meaning of the logical operator with which it is used, e.g. NOT EXISTS, NOT BETWEEN, NOT IN. This is a negate operator.
OR | The OR operator is used to combine multiple conditions in an SQL statement's WHERE clause; at least one of them must be true.
IS NULL | The IS NULL operator is used to test whether a value is NULL.
UNIQUE | The UNIQUE operator searches every row of a specified table for uniqueness (no duplicates).
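Several of these operators can be tried with the same kind of in-memory SQLite setup through Python's sqlite3 module. Everything here is invented for the illustration: the customer and orders tables, their rows, and the names helper. ALL, ANY and UNIQUE are left out because, to my knowledge, SQLite does not support them as operators.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, name TEXT, city TEXT, phone TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (1, "Asha", "Pune", None),  # phone is NULL
        (2, "Amit", "Delhi", "555-0101"),
        (3, "Bella", "Mumbai", "555-0102"),
    ],
)
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 150), (1, 500), (2, 80)])

HAS_ORDER = "EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)"


def names(condition):
    """Return customer names matching a hard-coded WHERE condition, sorted by name."""
    sql = f"SELECT name FROM customer AS c WHERE {condition} ORDER BY name"
    return [row[0] for row in conn.execute(sql)]


for condition in (
    "city IN ('Pune', 'Delhi')",
    "name LIKE 'A%'",
    "phone IS NULL",
    HAS_ORDER,
    "NOT " + HAS_ORDER,
    "name LIKE 'A%' AND city = 'Delhi'",
    "city = 'Mumbai' OR phone IS NULL",
    "EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id"
    " AND o.amount BETWEEN 100 AND 200)",
):
    print(condition, "->", names(condition))
```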