Você está na página 1de 36

Aggregation Operations on Relations

P. M. Jat @ DAIICT

First, Issues with Null Values

8/26/2011

Database Systems

Issues with Null Values


An attribute having NULL could mean either of following Value is unknown or not available right now Value is not application for the tuple: a employee not having supervisor will have null in this attribute Arithmetic expressions (+,-,*,/) involving null values result null value for result

8/26/2011

Database Systems

Null Values and Comparisons


There are more issues with Null value when it appears in a attribute used in WHERE clause of SQL-SELECT What should ( r.a < 10 ) where r.a happens to be null When we compare a NULL value with another value including NULL, result is UNKNOWN. In relational operations UNKNOWN is another truthvalue like TRUE and FALSE
8/26/2011 Database Systems 4

Truth values for UNKOWN


NOT NOT UNKOWN -> UNKWON AND TRUE AND UNKOWN -> UNKOWN FALSE AND UNKOWN -> FALSE UNKWON AND UNKOWN -> UNKOWN OR TRUE OR UNKOWN -> TRUE FALSE OR UNKOWN -> UNKOWN UNKWON OR UNKOWN -> UNKOWN
8/26/2011 Database Systems 5

Null Values and Comparisons


While evaluating WHERE clause tuples with UNKOWN or FALSE truth values are not included in result Find out how presence of null in any of attribute will affect result of following predicates SELECT * FROM EMPLOYEE WHERE bdate < DATE '2001-01-01' AND salary > 30000 SELECT * FROM EMPLOYEE WHERE bdate < DATE 2001-01-01 OR salary > 30000

8/26/2011

Database Systems

Null Values and Comparisons IS NULL


Following will not give desired result. Why? SELECT * FROM employee WHERE superssn = NULL;

This is so because Null = Null is also UNKOWN, therefore SQL provides IS NULL (and IS NOT NULL) to check if a tuple has null value in an attribute We write as following for such situations
SELECT * FROM employee WHERE superssn IS NULL ;
8/26/2011 Database Systems 7

Now, Aggregation Operations

8/26/2011

Database Systems

Consider getting answer of following queries from employee relation of company schema
Find out total salary we pay Find out total number of employees, total number of supervisors Find out employee who are drawing maximum salary Find out employee who are drawing above average salary, Find out average salary paid to managers, And so forth These are basically aggregation operations
8/26/2011 Database Systems 9

Aggregation operations
Are basically counting or summing of an attributevalues over all tuples or grouped tupples of a relations Following are common aggregation operations: SUM, AVG, MAX, MIN, COUNT are common aggregate operations over specified column of a table

8/26/2011

Database Systems

10

Aggregation operationsLet us see some examples in SQL, first

SELECT count(*) FROM employee SELECT count(DISTINCT dno) FROM employee; SELECT min(salary), max(salary), avg(salary) FROM employee;

8/26/2011

Database Systems

11

Aggregation operations - COUNT


SELECT count(*) FROM employee returns count of tuples SELECT count(superssn) FROM employee returns count of tuples having superssn value (excludes tuples having null value in superssn) SELECT count(DISTINCT superssn) FROM employee; returns count of distinct values of superssn (excludes tuples having null value)
8/26/2011 Database Systems 12

Aggregation operations: sum, min, max, avg


Example Query: SELECT sum(salary), min(salary), max(salary), avg(salary) FROM employee; The query returns sum of salaries (of all salary values), min salary, average of salaries computed over all tuples of employee relation

8/26/2011

Database Systems

13

NULLs are ignored in Aggregation


NULL never contributes to a sum, average, or count, and can never be the minimum or maximum of a column. But if there are no non-NULL values in a column, then the result of the aggregation is NULL. Exception: COUNT of an empty set is 0.

8/26/2011

Database Systems

14

Example: Effect of NULLs


Observe the difference in result of following two queries SELECT COUNT( * ) FROM employee; SELECT COUNT( superssn ) FROM employee;

8/26/2011

Database Systems

15

Eliminating Duplicates in an Aggregation


Observe the difference in result of following two queries SELECT COUNT( superssn ) FROM employee; SELECT COUNT( DISTINCT superssn ) FROM employee;

8/26/2011

Database Systems

16

Aggregation over grouped tuples


Normally aggregate operations are not applied to full column, values in column are grouped based on some attribute(s) Following query requires grouping of tuples Find out department-wise count of employees Find out total salary we pay to each department

8/26/2011

Database Systems

17

Aggregation operations- Grouping


For example (Q71): find out average of all departments. Will be written as following in SQL (and shows results as given against)SELECT dno, avg(salary) AS dept_avg FROM employee GROUP BY dno;

8/26/2011

Database Systems

18

GROUP BY groups the tuples for every distinct value in grouping attribute(s). For example GROUP BY dno, will group all tuples of employee for each dno, and aggregation applied over that group, and finally you have aggregated values for each dno.

8/26/2011

Database Systems

19

Example Q72

Modify solution of Query 71 such that you get Department Name also, i.e. Q72: List DNo, DName, and Avg(Salary) of all departments

8/26/2011

Database Systems

20

Example Q72
List DNo, DName, and Avg(Salary) of all departmentsSELECT dno, dname, dept_avg FROM (SELECT dno, avg(salary) AS dept_avg FROM employee GROUP BY dno) AS avg_sal NATURAL JOIN department;

8/26/2011

Database Systems

21

Example Q72
List DNo, DName, and Avg(Salary) of all departments Following solution will also give correct result but is less intuitive solution
SELECT dno, dname, avg(salary) AS dept_avg FROM employee NATURAL JOIN department GROUP BY dno, dname;

8/26/2011

Database Systems

22

Aggregate operations in RA

Symbolically can be represented using script F (F) operator. General form could be<grouping-attributes> F<funtion-list>(R) Typically functions are: count, sum, max, min, avg

8/26/2011

Database Systems

23

Aggregate operations in RA
For example, a aggregate queryDNO

FDNO, COUNT(SSN) -> No_of_Emps,


AVG(SALARY) -> AVG_SAL (EMPLOYEE)

In SQL, you write, as SELECT dno, count(ssn) AS No_of_Emps, avg(salary) AS avg_sal FROM employee GROUP BY dno;

8/26/2011

Database Systems

24

Aggregation operations- GROUP BY


Normally aggregate functions appear in SELECT clause of query, however attributes that appear in GROUP BY can also (and only ?) appear here. Obviously following (in red) are wrongly placed attributes in statements below SELECT dno, ssn, avg(salary) AS avg_sal FROM employee GROUP BY dno; SELECT dno, avg(salary) AS avg_sal FROM employee;
8/26/2011 Database Systems 25

HAVING clause
HAVING is used to specify restrict over result of aggregation For example:
SELECT dno, avg(salary) AS avg_sal FROM employee GROUP BY dno HAVING avg(salary) > 50000;

Note that you can use avg_sal instead of avg(salary) in HAVING clause, because renaming is done at the time of projection
8/26/2011 Database Systems 26

Aggregate operations in SQL


Semantics of SELECT-GROUP BY-HAVING in sql select statement SELECT <attrib and/or function-list> FROM <table-list> [WHERE <condition>] [GROUP BY <grouping attribute(s)>] [HAVING <group-filter-condition>] [ORDER BY <attrib-list>];

8/26/2011

Database Systems

27

Semantics of SELECT-GROUP BY-HAVING


Result of FROM and WHERE is given to GROUP BY operation GROUP BY operation computes aggregated values for each group value, and gives you one tuple for each group value. group-filter-conditions in HAVING are used to apply restriction over result of GROUP BY operation Finally project is applied as defined in SELECT clause of SELECT statement.

8/26/2011

Database Systems

28

Example:
List employee-ssn along with count of employees they are supervising List employee-ssn along with count of employees they are supervising that are supervising more than 2 employees List employee-ssn, employee-fname along with count of employees they are supervising that are supervising more than 2 employees
8/26/2011 Database Systems 29

List employee-ssn along with count of employees they are supervising


SELECT superssn, COUNT(superssn) FROM employee GROUP BY superssn;

Following are also OK SELECT superssn, COUNT(ssn) FROM employee GROUP BY superssn; SELECT superssn, COUNT(*) FROM employee GROUP BY superssn;
8/26/2011 Database Systems 30

Example:

List employee-ssn along with count of employees they are supervising that are supervising more than 2 employees

8/26/2011

Database Systems

31

List employee-ssn along with count of employees they are supervising that are supervising more than 2 employees
SELECT superssn, COUNT(superssn) FROM employee GROUP BY superssn HAVING count ( superssn ) > 2;

8/26/2011

Database Systems

32

Example:
List employee-ssn, employee-fname, no of emps supervised for emps that are supervising more than 2 employees

8/26/2011

Database Systems

33

List employee-ssn, employee-fname, no of emps supervised for emps that are supervising more than 2 employees
SELECT emp.ssn, emp.fname, ssns.NoOfEmpsSupervised FROM employee AS emp JOIN (SELECT superssn, COUNT(superssn) AS NoOfEmpsSupervised FROM employee GROUP BY superssn HAVING count ( superssn ) > 2) AS ssns ON emp.ssn = ssns.superssn;

8/26/2011

Database Systems

34

More queries
Find out average salary manager List second level supervisors (may not precisely aggregation query)

8/26/2011

Database Systems

35

Thanks

8/26/2011

Database Systems

36

Você também pode gostar