Você está na página 1de 34

Module 3: Grouping and Summarizing Data

Module 3: Grouping and Summarizing Data


Summarizing Data by Using Aggregate Functions Summarizing Grouped Data

Ranking Grouped Data


Creating Crosstab Queries

Lesson 1: Summarizing Data by Using Aggregate Functions


Aggregate Functions Native to SQL Server Using Aggregate Functions with NULL Values

CLR Integration, Assemblies


Implementing Custom Aggregate Functions

Aggregate Functions Native to SQL Server


Can be used in:

The select list of a SELECT statement A COMPUTE or COMPUTE BY clause A HAVING clause

USE AdventureWorks USE AdventureWorks SELECT AVG(VacationHours)AS MAX(TaxRate) 'AverageVacationHours', SUM SELECT COUNT(*) (SickLeaveHours) AS 'TotalSickLeaveHours Sales.SalesTaxRate FROM Sales.SalesPerson FROM HumanResources.Employee GROUP SalesQuota WHERE BY TaxType;> 25000; WHERE Title LIKE 'Vice President%'

Using Aggregate Functions With NULL Values


Most aggregate functions ignore NULL values

NULL values may produce unexpected or incorrect results Use the ISNULL function to correct this issue USE AdventureWorks SELECT AVG(ISNULL(Weight,0)) AS AvgWeight FROM Production.Product

The COUNT(*) function is an exception and returns the total number of records in a table

CLR Integration, Assemblies


Introduced in SQL Server 2005

Deploy custom developed CLR assemblies within the SQL Server Process Leverage the .NET 2.0 API

Implementing Custom Aggregate Functions


Custom programs in an assembly that are imported

into SQL Server using the integrated CLR

Can be created using any .NET language such as C# and VB CREATE ASSEMBLY StringUtilities FROM PathToAssembly\StringUtilities.dll' WITH PERMISSION_SET=SAFE; CREATE AGGREGATE Concatenate(@input nvarchar(4000)) RETURNS nvarchar(4000) EXTERNAL NAME [StringUtilities].[Concatenate]

Demonstration: Using Common Aggregate Functions


This demonstration shows how to use some of the

common aggregate functions in SQL Server 2008. The Human Resources (HR) department of Adventure Works requires information about employees. This would be an excellent situation to show the usage of several aggregate functions.

Lesson 2: Summarizing Grouped Data


Using the GROUP BY clause Filtering Grouped Data by Using the HAVING Clause

Building a Query for Summarizing Grouped Data GROUP

BY

Examining How the ROLLUP and CUBE Operators Work Using the ROLLUP and CUBE Operators Using the COMPUTE and COMPUTE BY Clauses Building a Query for Summarizing Grouped Data -

COMPUTE

Using GROUPING SETS

Using the GROUP BY Clause


Specifies the groups into which the output rows must be placed
Calculates a summary value for aggregate functions

SELECT SalesOrderID, SUM(LineTotal) AS SubTotal


FROM Sales.SalesOrderDetail GROUP BY SalesOrderID ORDER BY SalesOrderID
SalesOrderID SubTotal

1
2 3 4

23761
45791 75909 19900

Source Table

Group 1

Group 2

Filtering Grouped Data by Using the HAVING Clause


Specifies a search condition for a group
Can be used only with the SELECT statement

SELECT SalesOrderID, SUM(LineTotal) AS SubTotal

FROM Sales.SalesOrderDetail
GROUP BY SalesOrderID HAVING SUM(LineTotal) > 100000.00

SalesOrderID 43875

SubTotal 121761.939600

43884
44518 44528 44530 44795 46066 ...

115696.331324
126198.336168 108783.587200 104958.806836 104111.515642 100378.907800

ORDER BY SalesOrderID

Building a Query for Summarizing Grouped Data GROUP BY


GROUP BY
SELECT A.City, COUNT(E. BusinessEntityID) EmployeeCount FROM HumanResources.Employee E INNER JOIN Person.Address A ON E.BusinessEntityID = A.AddressID GROUP BY A.City ORDER BY A.City;
City EmployeeCount

Bellevue
Berlin Bordeaux Bothell Calgary Cambridge ...

35
1 1 22 1 2

GROUP BY with HAVING clause


'Total Order Amount'
FROM Sales.SalesOrderHeader GROUP BY DATEPART(yyyy,OrderDate) HAVING DATEPART(yyyy,OrderDate) >= '2003' ORDER BY DATEPART(yyyy,OrderDate)

SELECT DATEPART(yyyy,OrderDate) AS 'Year' ,SUM(TotalDue) AS

Year

Total Order Amount

2003
2004

54307615.0868
32196912.4165

Examining How the ROLLUP and CUBE Operators Work


ROLLUP and CUBE generate summary information in a

query

ROLLUP generates a result set showing the aggregates

for a hierarchy of values in selected columns

SELECT a, b, c, SUM ( <expression> )


FROM T GROUP BY ROLLUP (a,b,c)
CUBE generates a result set that shows the aggregates

for all combination of values in selected columns SELECT a, b, c, SUM (<expression>) FROM T GROUP BY CUBE (a,b,c)

Using the ROLLUP and CUBE Operators


SELECT ProductID,Shelf,SUM(Quantity) AS QtySum FROM Production.ProductInventory WHERE ProductID<6 GROUP BY ROLLUP(ProductID,Shelf)
ProductID 1 1 1 2 2 2 3 3 3 Shelf A B NULL A B NULL A B NULL QtySum 761 324 1085 791 318 1109 909 443 1352

4 ProductID
1

A Shelf
A A A A A B B B B B

900 QtySum
761 791 909 900 3361 324 318 443 442 1507

SELECT ProductID,Shelf,SUM(Quantity) AS QtySum FROM Production.ProductInventory WHERE ProductID<6 GROUP BY CUBE(ProductID, Shelf)

2 3 4 NULL 1 2 3 4 NULL

Demonstration: How to Use the ROLLUP and CUBE Operators


This demonstration shows how to use the ROLLUP and

CUBE operator extensions of the GROUP BY SELECT clause. You have been asked to summarize some information in the AdventureWorks database. This would be an excellent usage of these new operators.

Examining the COMPUTE and COMPUTE BY Clauses


COMPUTE generates additional summary rows in a non-

relational format

COMPUTE BY generates control-breaks and subtotals in

the result set

Both COMPUTE and COMPUTE BY can be used in the

same query

Building a Query For Summarizing Grouped Data - COMPUTE


COMPUTE
SELECT SalesOrderID, UnitPrice, UnitPriceDiscount FROM Sales.SalesOrderDetail ORDER BY SalesOrderID COMPUTE SUM(UnitPrice), SUM(UnitPriceDiscount)
SalesOrderID 1 2 3 UnitPrice 50 135 NULL UnitPriceDiscount 200 350 1085

sum
185

Sum
1635

COMPUTE BY

SELECT SalesPersonID, CustomerID, OrderDate, SubTotal, TotalDue FROM Sales.SalesOrderHeader ORDER BY SalesPersonID, OrderDate COMPUTE SUM(SubTotal), SUM(TotalDue) BY SalesPersonID

Using GROUPING SETS


New GROUP BY operator that aggregates several

groupings in one query

Eliminates multiple GROUP BY queries


SELECT T.[Group] AS 'Region', T.CountryRegionCode AS 'Country' ,S.Name AS 'Store', H.SalesPersonID ,SUM(TotalDue) AS 'Total Sales' FROM Sales.Customer C INNER JOIN Sales.Store S ON C.StoreID = S.BusinessEntityID INNER JOIN Sales.SalesTerritory T ON C.TerritoryID = T.TerritoryID INNER JOIN Sales.SalesOrderHeader H ON C.CustomerID = H.CustomerID WHERE T.[Group] = 'Europe' AND T.CountryRegionCode IN('DE', 'FR') AND SUBSTRING(S.Name,1,4)IN('Vers', 'Spa ')

Region

Country

Store

NULL
NULL NULL

NULL
NULL NULL

NULL
NULL NULL

NULL
NULL NULL NULL Europe

NULL
NULL DE FR NULL

Spa and Exercise Outfitters


Versatile Sporting Good Company NULL NULL NULL

GROUP BY GROUPING SETS


(T.[Group], T.CountryRegionCode, S.Name, H.SalesPersonID) ORDER BY T.[Group], T.CountryRegionCode, S.Name, H.SalesPersonID;

Supports additional options as ability to use with

ROLLUP and CUBE operators

Lesson 3: Ranking Grouped Data


What Is Ranking? Ranking Data by Using RANK

Ranking Data by Using DENSE_RANK


Ranking Data by Using ROW_NUMBER Ranking Data by Using NTILE

Categorizing the Ranking Functions Based On Their

Functionality

What s Ranking?
Ranking computes a value for each row in user-specified

set of rows.

Ranking functions include: RANK DENSE_RANK NTILE ROW_NUMBER Can be used for: Arbitrary row numbering Product Popularity Best Customers

Ranking Data by Using RANK


SELECT P.Name Product, P.ListPrice, PSC.Name Category, RANK() OVER(PARTITION BY PSC.Name ORDER BY P.ListPrice DESC) AS PriceRank FROM Production.Product P JOIN Production.ProductSubCategory PSC ON P.ProductSubCategoryID = PSC.ProductSubCategoryID
Product Men's Bib-Shorts, S Men's Bib-Shorts, M Men's Bib-Shorts, L Hitch Rack - 4-Bike ListPrice 89.99 89.99 89.99 120.00 Category Bib-Shorts Bib-Shorts Bib-Shorts Bike Racks PriceRank 1 1 1 1

All-Purpose Bike Stand


Mountain Bottle Cage Road Bottle Cage

159.00
9.99 8.99

Bike Stands
Bottles and Cages Bottles and Cages

1
1 2

Ranking Data by Using DENSE_RANK


SELECT P.Name Product, P.ListPrice, PSC.Name Category, DENSE_RANK() OVER(PARTITION BY PSC.Name ORDER BY P.ListPrice DESC) AS PriceRank FROM Production.Product P JOIN Production.ProductSubCategory PSC ON P.ProductSubCategoryID = PSC.ProductSubCategoryID

Product HL Mountain Handlebars HL Road Handlebars HL Touring Handlebars ML Mountain Handlebars HL Headset ML Headset LL Headset

ListPrice 120.27 120.27 91.57 61.92 124.73 102.29 34.20

Category Handlebars Handlebars Handlebars Handlebars Headsets Headsets Headsets

PriceRank 1 1 2 3 1 2 3

Ranking Data by Using ROW_NUMBER


SELECT ROW_NUMBER()
OVER(PARTITION BY PC.Name ORDER BY ListPrice) AS Row, PC.Name Category, P.Name Product, P.ListPrice FROM Production.Product P

JOIN Production.ProductSubCategory PSC


ON P.ProductSubCategoryID = PSC.ProductSubCategoryID JOIN Production.ProductCategory PC ON PSC.ProductCategoryID = PC.ProductCategoryID

Row Category 1 2 1 2 Accessories Accessories Bikes Bikes

Product Patch Kit/8 Patches Road Tire Tube Road-750 Black, 44 Road-750 Black, 44

ListPrice 2.29 3.99 539.99 539.99

Ranking Data by Using NTILE


SELECT NTILE(3) OVER(PARTITION BY PC.Name ORDER BY ListPrice) AS PriceBand, PC.Name Category, P.Name Product, P.ListPrice FROM Production.Product P JOIN Production.ProductSubCategory PSC ON P.ProductSubCategoryID = PSC.ProductSubCategoryID JOIN Production.ProductCategory PC ON PSC.ProductCategoryID = PC.ProductCategoryID

PriceBand Category

Product

ListPrice

1
1 2 2 3 3 1 1 2

Accessories Patch
Accessories Road Accessories LL Accessories ML

Kit/8 Patches
Tire Tube Road Tire Road Tire

2.29
3.99 21.49 24.99 34.99 35.00 539.99 782.99 782.99

Accessories Sport-100 Helmet, Blue Accessories HL Bikes Bikes Bikes ... Mountain Tire Road-750 Black, 44 Road-650 Red, 48 Road-650 Red, 52

Categorizing the Ranking Functions Based On Their Functionality

Feature

RANK

DENSE_RANK

NTILE

ROW_NUMBER

Rows equally distributed among groups

Rows of equal rank are assigned the same value


Sequential numbering of rows within a partition

Lesson 4: Creating Crosstab Queries


How the PIVOT and UNPIVOT Operators Work Using the PIVOT Operator

Using the UNPIVOT Operator


Grouping and Summarization Features New to SQL Server

2008

How the PIVOT and UNPIVOT Operators Work


PIVOT converts values to columns
Cust Mike Mike Mike Lisa Lisa Prod Bike Chain Bike Bike Chain Qty 3 2 5 3 3

Cust Mike Lisa

Bike Chain 8 3 2 7

Lisa

Chain

UNPIVOT converts columns to values


Cust Cust Mike Lisa Bike Chain 8 3 2 7 Mike Mike Lisa Prod Bike Chain Bike Qty 8 2 3

Lisa

Chain

Using the PIVOT Operator

PIVOT converts values to columns


Cust Mike Mike Prod Bike Chain Qty 3 2 SELECT * FROM Sales.Order PIVOT (SUM(Qty) FOR Prod IN ([Bike],[Chain])) PVT Cust Bike Chain

Mike
Lisa Lisa

Bike
Bike Chain

5
3 3

Mike Lisa

8 3

2 7

Lisa

Chain

Using the UNPIVOT Operator

UNPIVOT converts columns to values


Cust Cust Mike Lisa Bike Chain 8 3 2 7 SELECT Cust, Prod, Qty FROM Sales.PivotedOrder UNPIVOT (Qty FOR Prod IN ([Bike],[Chain])) UnPVT Mike Mike Prod Bike Chain Qty 8 2

Lisa
Lisa

Bike
Chain

3
7

Grouping and Summarization Features New to SQL Server 2008

GROUPING SETS Operator

ROLLUP Operator

CUBE Operator

GROUPING_ID Function

Lab: Grouping and Summarizing Data


Exercise 1: Summarizing Data by Using Aggregate

Functions

Exercise 2: Summarizing Grouped Data Exercise 3: Ranking Grouped Data Exercise 4: Creating Crosstab Queries

Logon information

Virtual machine User name Password

NY-SQL-01 Administrator Pa$$w0rd

Estimated time: 60 minutes

Lab Scenario
You are the database administrator of Adventure Works.

The HR department wants you to prepare a report on the number of vacation hours of the vice presidents of the company, and another report on the number of employees with incomplete addresses. The warehouse manager requires three reports on the average number of hours taken to manufacture a product, the order quantity and the price list of each product, and the safety stock level for red, blue, and black helmets. Besides these requests, the marketing manager wants you to prepare a report on the year-to-date sales of salespersons and rank their names based on their performance.

Lab Review
When you execute queries with aggregate functions you

may see a warning message Null value is eliminated by an aggregate or other SET operation. Why do you get this?
xyz is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. What did you forget to do? on their year to date sales, were the amount of results in each category even? Why or why not?

When executing your query you receive the error Column

When ranking the salespersons into four categories based

Module Review and Takeaways


Review Questions

Você também pode gostar