HTTP WWW - Akadia

Analytic Functions in Oracle 8i and 9i
Contents
Overview and Introduction

How Analytic Functions Work
The Syntax
Calculate a running Total
Top-N Queries
Example 1
Example 2
Windows
Range Windows
Compute average salary for defined range
Row Windows
Accessing Rows Around Your Current Row
LAG
LEAD
Determine the First Value / Last Value of a Group
Crosstab or Pivot Queries
Conclusion
Overview
Analytic Functions, which have been available since Oracle 8.1.6, are
designed to address such problems as "Calculate a running total", "Find
percentages within a group", "Top-N queries", "Compute a moving average"
and many more. Most of these problems can be solved using standard PL/
SQL, however the performance is often not what it should be. Analytic
Functions add extensions to the SQL language that not only make these
operations easier to code; they make them faster than could be achieved
with pure SQL or PL/SQL. These extensions are currently under review by the
ANSI SQL committee for inclusion in the SQL specification.
How Analytic Functions Work ?
Analytic functions compute an aggregate value based on a group of rows.

They differ from aggregate functions in that they return multiple rows for
each group. The group of rows is called a window and is defined by the
analytic clause. For each row, a "sliding" window of rows is defined. The
window determines the range of rows used to perform the calculations for
the "current row". Window sizes can be based on either a physical number of
rows or a logical interval such as time.
Analytic functions are the last set of operations performed in a query except
http://www.akadia.com/services/ora_analytic_functions.html (1 of 17)1/22/2008 6:09:30 PM

for the final ORDER BY clause. All joins and all WHERE, GROUP BY, and
HAVING clauses are completed before the analytic functions are processed.
Therefore, analytic functions can appear only in the select list or ORDER BY
clause.
The Syntax
The Syntax of analytic functions is rather straightforward in appearance
Analytic-Function(<Argument>,<Argument>,...)
OVER (
<Query-Partition-Clause>
<Order-By-Clause>
<Windowing-Clause>
)
● Analytic-Function
Specify the name of an analytic function, Oracle actually

provides many analytic functions such as AVG, CORR,
COVAR_POP, COVAR_SAMP, COUNT, CUME_DIST,
DENSE_RANK, FIRST, FIRST_VALUE, LAG, LAST, LAST_VALUE,
LEAD, MAX, MIN, NTILE, PERCENT_RANK, PERCENTILE_CONT,
PERCENTILE_DISC, RANK, RATIO_TO_REPORT, STDDEV,
STDDEV_POP, STDDEV_SAMP, SUM, VAR_POP, VAR_SAMP,
VARIANCE.
● Arguments
Analytic functions take 0 to 3 arguments.
● Query-Partition-Clause
The PARTITION BY clause logically breaks a single result set into

N groups, according to the criteria set by the partition
expressions. The words "partition" and "group" are used
synonymously here. The analytic functions are applied to each
group independently, they are reset for each group.
● Order-By-Clause
The ORDER BY clause specifies how the data is sorted within

each group (partition). This will definitely affect the outcome of
any analytic function.
● Windowing-Clause
The windowing clause gives us a way to define a sliding or

anchored window of data, on which the analytic function will

operate, within a group. This clause can be used to have the
analytic function compute its value based on any arbitrary
sliding or anchored window within a group. More information on
windows can be found here.
Example: Calculate a running Total
This example shows the cumulative salary within a departement row by row,
with each row including a summation of the prior rows salary.
set autotrace traceonly explain

break on deptno skip 1
column ename format A6
column deptno format 999
column sal format 99999
column seq format 999
SELECT ename "Ename", deptno "Deptno", sal "Sal",

SUM(sal)
OVER (ORDER BY deptno, ename) "Running Total",
SUM(SAL)
OVER (PARTITION BY deptno
ORDER BY ename) "Dept Total",
ROW_NUMBER()
ORDER BY ENAME) "Seq"
FROM emp
ORDER BY deptno, ename
/
Ename Deptno Sal Running Total Dept Total Seq

------ ------ ------ ------------- ---------- ----
CLARK 10 2450 2450 2450 1
KING 5000 7450 7450 2
MILLER 1300 8750 8750 3
ADAMS 20 1100 9850 1100 1

FORD 3000 12850 4100 2
JONES 2975 15825 7075 3
SCOTT 3000 18825 10075 4
SMITH 800 19625 10875 5
ALLEN 30 1600 21225 1600 1

BLAKE 2850 24075 4450 2
JAMES 950 25025 5400 3
MARTIN 1250 26275 6650 4
TURNER 1500 27775 8150 5
WARD 1250 29025 9400 6

Execution Plan
---------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 WINDOW (SORT)
2 1 TABLE ACCESS (FULL) OF 'EMP'
Statistics
---------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
1658 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
14 rows processed
The example shows how to calculate a "Running Total" for the entire
query. This is done using the entire ordered result set, via SUM(sal) OVER
(ORDER BY deptno, ename).
Further, we were able to compute a running total within each department,

a total that would be reset at the beginning of the next department. The
PARTITION BY deptno in that SUM(sal) caused this to happen, a partitioning
clause was specified in the query in order to break the data up into groups.
The ROW_NUMBER() function is used to sequentially number the rows

returned in each group, according to our ordering criteria (a "Seq" column
was added to in order to display this position).
The execution plan shows, that the whole query is very well performed with
only 3 consistent gets, this can never be accomplished with standard SQL or
even PL/SQL.
Top-N Queries
How can we get the Top-N records by some set of fields ?

Prior to having access to these analytic functions, questions of this nature
were extremely difficult to answer.
There are some problems with Top-N queries however; mostly in the way
people phrase them. It is something to be careful about when designing
reports. Consider this seemingly sensible request:
I would like the top three paid sales reps by department
The problem with this question is that it is ambiguous. It is ambiguous

because of repeated values, there might be four people who all make the
same salary, what should we do then ?
Let's look at three examples, all use the well known table EMP.
Example 1
Sort the sales people by salary from greatest to least. Give the first three
rows. If there are less then three people in a department, this will return less
than three records.
set autotrace on explain

SELECT * FROM (
SELECT deptno, ename, sal, ROW_NUMBER()
OVER (
PARTITION BY deptno ORDER BY sal DESC
) Top3 FROM emp
)
WHERE Top3 <= 3
/
DEPTNO ENAME SAL TOP3

---------- ---------- ---------- ----------
10 KING 5000 1
CLARK 2450 2
MILLER 1300 3
20 SCOTT 3000 1
FORD 3000 2
JONES 2975 3
30 BLAKE 2850 1
ALLEN 1600 2
TURNER 1500 3
9 rows selected.
Execution Plan
--------------------------------------------
1 0 VIEW
2 1 WINDOW (SORT)
This query works by sorting each partition (or group, which is the deptno), in
a descending order, based on the salary column and then assigning a
sequential row number to each row in the group as it is processed. The use
of a WHERE clause after doing this to get just the first three rows in each
partition.

Example 2
Give me the set of sales people who make the top 3 salaries - that is, find
the set of distinct salary amounts, sort them, take the largest three, and give
me everyone who makes one of those values.
SELECT * FROM (
SELECT deptno, ename, sal,
DENSE_RANK()
OVER (
PARTITION BY deptno ORDER BY sal desc
) TopN FROM emp
)
WHERE TopN <= 3
ORDER BY deptno, sal DESC
/
DEPTNO ENAME SAL TOPN

---------- ---------- ---------- ----------
10 KING 5000 1
CLARK 2450 2
MILLER 1300 3
20 SCOTT 3000 1 <--- !

FORD 3000 1 <--- !
JONES 2975 2
ADAMS 1100 3
30 BLAKE 2850 1
ALLEN 1600 2
30 TURNER 1500 3
10 rows selected.
Execution Plan
--------------------------------------------
1 0 VIEW
2 1 WINDOW (SORT PUSHED RANK)
Here the DENSE_RANK function was used to get the top three salaries. We
assigned the dense rank to the salary column and sorted it in a descending
order.
The DENSE_RANK function computes the rank of a row in an ordered group

of rows. The ranks are consecutive integers beginning with 1. The largest
rank value is the number of unique values returned by the query. Rank
values are not skipped in the event of ties. Rows with equal values for the
ranking criteria receive the same rank.

The DENSE_RANK function does not skip numbers and will assign the same
number to those rows with the same value. Hence, after the result set is built
in the inline view, we can simply select all of the rows with a dense rank of
three or less, this gives us everyone who makes the top three salaries by
department number.
Windows
The windowing clause gives us a way to define a sliding or anchored window

of data, on which the analytic function will operate, within a group. The
default window is an anchored window that simply starts at the first row of
a group an continues to the current row.
We can set up windows based on two criteria: RANGES of data values or

ROWS offset from the current row. It can be said, that the existance of
an ORDER BY in an analytic function will add a default window clause of
RANGE UNBOUNDED PRECEDING. That says to get all rows in our partition
that came before us as specified by the ORDER BY clause.
Let's look at an example with a sliding window within a group and compute
the sum of the current row's SAL column plus the previous 2 rows in that
group. If we need a report that shows the sum of the current employee's
salary with the preceding two salaries within a departement, it would look
like this.

column ename format A6
column deptno format 999
column sal format 99999
SELECT deptno "Deptno", ename "Ename", sal "Sal",

SUM(SAL)
ORDER BY ename
ROWS 2 PRECEDING) "Sliding Total"
FROM emp
ORDER BY deptno, ename
/
Deptno Ename Sal Sliding Total

------ ------ ------ -------------
10 CLARK 2450 2450
KING 5000 7450
MILLER 1300 8750
20 ADAMS 1100 1100

FORD 3000 4100
JONES 2975 7075 ^
SCOTT 3000 8975 |

SMITH 800 6775 \-- Sliding Window
30 ALLEN 1600 1600

BLAKE 2850 4450
JAMES 950 5400
MARTIN 1250 5050
TURNER 1500 3700
WARD 1250 4000
The partition clause makes the SUM (sal) be computed within each
department, independent of the other groups. Tthe SUM (sal) is ' reset ' as
the department changes. The ORDER BY ENAME clause sorts the data within
each department by ENAME; this allows the window clause: ROWS 2
PRECEDING, to access the 2 rows prior to the current row in a group
in order to sum the salaries.
For example, if you note the SLIDING TOTAL value for SMITH is 6 7 7 5,
which is the sum of 800, 3000, and 2975. That was simply SMITH's row plus
the salary from the preceding two rows in the window.
Range Windows
Range windows collect rows together based on a WHERE clause. If I say '
range 5 preceding ' for example, this will generate a sliding window that has
the set of all preceding rows in the group such that they are within 5 units of
the current row. These units may either be numeric comparisons or date
comparisons and it is not valid to use RANGE with datatypes other than
numbers and dates.
Example
Count the employees which where hired within the last 100 days preceding
the own hiredate. The range window goes back 100 days from the current
row's hiredate and then counts the rows within this range. The solution ist to
use the following window specification:
COUNT(*) OVER (ORDER BY hiredate ASC RANGE 100 PRECEDING)
column ename heading "Name" format a8

column hiredate heading "Hired" format a10
column hiredate_pre heading "Hired-100" format a10
column cnt heading "Cnt" format 99
SELECT ename, hiredate, hiredate-100 hiredate_pre,

COUNT(*)
OVER (
ORDER BY hiredate ASC
RANGE 100 PRECEDING
) cnt
FROM emp


/
Name Hired Hired-100 Cnt

-------- ---------- ---------- ---
SMITH 17-DEC-80 08-SEP-80 1
ALLEN 20-FEB-81 12-NOV-80 2
WARD 22-FEB-81 14-NOV-80 3
JONES 02-APR-81 23-DEC-80 3
BLAKE 01-MAY-81 21-JAN-81 4
CLARK 09-JUN-81 01-MAR-81 3
TURNER 08-SEP-81 31-MAY-81 2
MARTIN 28-SEP-81 20-JUN-81 2
KING 17-NOV-81 09-AUG-81 3
JAMES 03-DEC-81 25-AUG-81 5
FORD 03-DEC-81 25-AUG-81 5
MILLER 23-JAN-82 15-OCT-81 4
SCOTT 09-DEC-82 31-AUG-82 1
ADAMS 12-JAN-83 04-OCT-82 2
We ordered the single partition by hiredate ASC. If we look for example at

the row for CLARK we can see that his hiredate was 09-JUN-81, and 100
days prior to that is the date 01-MAR-81. If we look who was hired between
01-MAR-81 and 09-JUN-81, we find JONES (hired: 02-APR-81) and BLAKE
(hired: 01-MAY-81). This are 3 rows including the current row, this is what
we see in the column "Cnt" of CLARK's row.
Compute average salary for defined range
As an example, compute the average salary of people hired within 100 days
before for each employee. The query looks like this:
column ename heading "Name" format a8

column hiredate heading "Hired" format a10
column hiredate_pre heading "Hired-100" format a10
column avg_sal heading "Avg-100" format 999999
SELECT ename, hiredate, sal,

AVG(sal)
OVER (
RANGE 100 PRECEDING
) avg_sal
FROM emp
/
Name Hired SAL Avg-100

-------- ---------- ---------- -------
SMITH 17-DEC-80 800 800

ALLEN 20-FEB-81 1600 1200

WARD 22-FEB-81 1250 1217
JONES 02-APR-81 2975 1942
BLAKE 01-MAY-81 2850 2169
CLARK 09-JUN-81 2450 2758
TURNER 08-SEP-81 1500 1975
MARTIN 28-SEP-81 1250 1375
KING 17-NOV-81 5000 2583
JAMES 03-DEC-81 950 2340
FORD 03-DEC-81 3000 2340
MILLER 23-JAN-82 1300 2563
SCOTT 09-DEC-82 3000 3000
ADAMS 12-JAN-83 1100 2050
Look at CLARK again, since we understand his range window within the
group. We can see that the average salary of 2758 is equal to (2975+2850
+2450)/3. This is the average of the salaries for CLARK and the rows
preceding CLARK, those of JONES and BLAKE. The data must be sorted in
ascending order.
Row Windows
Row Windows are physical units; physical number of rows, to include in the
window. For example you can calculate the average salary of a given record
with the (up to 5) employees hired before them or after them as follows:
set numformat 9999

SELECT ename, hiredate, sal,
AVG(sal)
OVER (ORDER BY hiredate ASC ROWS 5 PRECEDING) AvgAsc,
COUNT(*)
OVER (ORDER BY hiredate ASC ROWS 5 PRECEDING) CntAsc,
AVG(sal)
OVER (ORDER BY hiredate DESC ROWS 5 PRECEDING) AvgDes,
COUNT(*)
OVER (ORDER BY hiredate DESC ROWS 5 PRECEDING) CntDes
FROM emp
ORDER BY hiredate
/
ENAME HIREDATE SAL AVGASC CNTASC AVGDES CNTDES

---------- --------- ----- ------ ------ ------ ------
SMITH 17-DEC-80 800 800 1 1988 6
ALLEN 20-FEB-81 1600 1200 2 2104 6
WARD 22-FEB-81 1250 1217 3 2046 6
JONES 02-APR-81 2975 1656 4 2671 6
BLAKE 01-MAY-81 2850 1895 5 2675 6
CLARK 09-JUN-81 2450 1988 6 2358 6
TURNER 08-SEP-81 1500 2104 6 2167 6
MARTIN 28-SEP-81 1250 2046 6 2417 6

KING 17-NOV-81 5000 2671 6 2392 6

JAMES 03-DEC-81 950 2333 6 1588 4
FORD 03-DEC-81 3000 2358 6 1870 5
MILLER 23-JAN-82 1300 2167 6 1800 3
SCOTT 09-DEC-82 3000 2417 6 2050 2
ADAMS 12-JAN-83 1100 2392 6 1100 1
The window consist of up to 6 rows, the current row and five rows " in front
of " this row, where " in front of " is defined by the ORDER BY clause. With
ROW partitions, we do not have the limitation of RANGE partition - the data
may be of any type and the order by may include many columns. Notice,
that we selected out a COUNT(*) as well. This is useful just to demonstrate
how many rows went into making up a given average. We can see clearly
that for ALLEN's record, the average salary computation for people hired
before him used only 2 records whereas the computation for salaries of
people hired after him used 6.
Accessing Rows Around Your Current Row
Frequently you want to access data not only from the current row but the
current row " in front of " or " behind " them. For example, let's say you need
a report that shows, by department all of the employees; their hire date;
how many days before was the last hire; how many days after was the next
hire.
Using straight SQL this query would be difficult to write. Not only that but its
performance would once again definitely be questionable. The approach I
typically took in the past was either to " select a select " or write a PL/SQL
function that would take some data from the current row and " find " the
previous and next rows data. This worked, but introduce large overhead into
both the development of the query and the run-time execution of the query.
Using analytic functions, this is easy and efficient to do.
set echo on
column deptno format 99 heading Dep

column ename format a6 heading Ename
column hiredate heading Hired
column last_hire heading LastHired
column days_last heading DaysLast
column next_hire heading NextHire
column days_next heading NextDays
SELECT deptno, ename, hiredate,

LAG(hiredate,1,NULL)
ORDER BY hiredate, ename) last_hire,
hiredate - LAG(hiredate,1,NULL)


ORDER BY hiredate, ename) days_last,
LEAD(hiredate,1,NULL)
ORDER BY hiredate, ename) next_hire,
LEAD(hiredate,1,NULL)
ORDER BY hiredate, ename) - hiredate days_next
FROM emp
ORDER BY deptno, hiredate
/
Dep Ename Hired LastHired DaysLast NextHire NextDays

--- ------ --------- --------- -------- --------- --------
10 CLARK 09-JUN-81 17-NOV-81 161
KING 17-NOV-81 09-JUN-81 161 23-JAN-82 67
MILLER 23-JAN-82 17-NOV-81 67
20 SMITH 17-DEC-80 02-APR-81 106

JONES 02-APR-81 17-DEC-80 106 03-DEC-81 245
FORD 03-DEC-81 02-APR-81 245 09-DEC-82 371
SCOTT 09-DEC-82 03-DEC-81 371 12-JAN-83 34
ADAMS 12-JAN-83 09-DEC-82 34
30 ALLEN 20-FEB-81 22-FEB-81 2

WARD 22-FEB-81 20-FEB-81 2 01-MAY-81 68
BLAKE 01-MAY-81 22-FEB-81 68 08-SEP-81 130
TURNER 08-SEP-81 01-MAY-81 130 28-SEP-81 20
MARTIN 28-SEP-81 08-SEP-81 20 03-DEC-81 66
JAMES 03-DEC-81 28-SEP-81 66
The LEAD and LAG routines could be considered a way to " index into your
partitioned group ". Using these functions you can access any individual row.
Notice for example in the above printout, it shows that the record for KING
includes the data (in bold red font) from the prior row (LAST HIRE) and the
next row (NEXT-HIRE). We can access the fields in records preceding or
following the current record in an ordered partition easily.
LAG
LAG ( value_expr [, offset] [, default] )

OVER ( [query_partition_clause] order_by_clause )
LAG provides access to more than one row of a table at the same time
without a self join. Given a series of rows returned from a query and a
position of the cursor, LAG provides access to a row at a given physical offset
prior to that position.
If you do not specify offset, then its default is 1. The optional default value is
returned if the offset goes beyond the scope of the window. If you do not

specify default, then its default value is null.
The following example provides, for each person in the EMP table, the salary
of the employee hired just before:
SELECT ename,hiredate,sal,
LAG(sal, 1, 0)
OVER (ORDER BY hiredate) AS PrevSal
FROM emp
WHERE job = 'CLERK';
Ename Hired SAL PREVSAL

------ --------- ----- -------
SMITH 17-DEC-80 800 0
JAMES 03-DEC-81 950 800
MILLER 23-JAN-82 1300 950
ADAMS 12-JAN-83 1100 1300
LEAD
LEAD ( value_expr [, offset] [, default] )

OVER ( [query_partition_clause] order_by_clause )
LEAD provides access to more than one row of a table at the same time
without a self join. Given a series of rows returned from a query and a
position of the cursor, LEAD provides access to a row at a given physical
offset beyond that position.
If you do not specify offset, then its default is 1. The optional default value is
returned if the offset goes beyond the scope of the table. If you do not
specify default, then its default value is null.
The following example provides, for each employee in the EMP table, the hire
date of the employee hired just after:
SELECT ename, hiredate,

LEAD(hiredate, 1)
OVER (ORDER BY hiredate) AS NextHired
FROM emp WHERE deptno = 30;
Ename Hired NEXTHIRED

------ --------- ---------
ALLEN 20-FEB-81 22-FEB-81
WARD 22-FEB-81 01-MAY-81
BLAKE 01-MAY-81 08-SEP-81
TURNER 08-SEP-81 28-SEP-81
MARTIN 28-SEP-81 03-DEC-81
JAMES 03-DEC-81

Determine the First Value / Last Value of a Group
The FIRST_VALUE and LAST_VALUE functions allow you to select the first
and last rows from a group. These rows are especially valuable because they
are often used as the baselines in calculations.
Example
The following example selects, for each employee in each department, the
name of the employee with the lowest salary.

FIRST_VALUE(ename)
ORDER BY sal ASC) AS MIN_SAL_HAS
FROM emp
ORDER BY deptno, ename;
DEPTNO ENAME SAL MIN_SAL_HAS

---------- ---------- ---------- -----------
10 CLARK 2450 MILLER
KING 5000 MILLER
MILLER 1300 MILLER
20 ADAMS 1100 SMITH

FORD 3000 SMITH
JONES 2975 SMITH
SCOTT 3000 SMITH
SMITH 800 SMITH
30 ALLEN 1600 JAMES

BLAKE 2850 JAMES
JAMES 950 JAMES
MARTIN 1250 JAMES
TURNER 1500 JAMES
WARD 1250 JAMES
The following example selects, for each employee in each department, the
name of the employee with the highest salary.

FIRST_VALUE(ename)
ORDER BY sal DESC) AS MAX_SAL_HAS
FROM emp
ORDER BY deptno, ename;
DEPTNO ENAME SAL MAX_SAL_HAS

---------- ---------- ---------- -----_-----

10 CLARK 2450 KING
KING 5000 KING
MILLER 1300 KING
20 ADAMS 1100 FORD

FORD 3000 FORD
JONES 2975 FORD
SCOTT 3000 FORD
SMITH 800 FORD
30 ALLEN 1600 BLAKE

BLAKE 2850 BLAKE
JAMES 950 BLAKE
MARTIN 1250 BLAKE
TURNER 1500 BLAKE
WARD 1250 BLAKE
The following example selects, for each employee in department 30 the

name of the employee with the lowest salary using an inline view

FIRST_VALUE(ename)
OVER (ORDER BY sal ASC) AS MIN_SAL_HAS
FROM (SELECT * FROM emp WHERE deptno = 30)
DEPTNO ENAME SAL MIN_SAL_HAS

---------- ---------- ---------- -----------
30 JAMES 950 JAMES
MARTIN 1250 JAMES
WARD 1250 JAMES
TURNER 1500 JAMES
ALLEN 1600 JAMES
BLAKE 2850 JAMES
Crosstab or Pivot Queries
A crosstab query, sometimes known as a pivot query, groups your data in

a slightly different way from those we have seen hitherto. A crosstab query
can be used to get a result with three rows (one for each project), with each
row having three columns (the first listing the projects and then one column
for each year) -- like this:
Project 2001 2002

ID CHF CHF
-------------------------------
100 123.00 234.50
200 543.00 230.00
300 238.00 120.50

Example
Let's say you want to show the top 3 salary earners in each department as
columns. The query needs to return exactly 1 row per department and the
row would have 4 columns. The DEPTNO, the name of the highest paid
employee in the department, the name of the next highest paid, and so on.
Using analytic functions this almost easy, without analytic functions this was
virtually impossible.
SELECT deptno,
MAX(DECODE(seq,1,ename,null)) first,
MAX(DECODE(seq,2,ename,null)) second,
MAX(DECODE(seq,3,ename,null)) third
FROM (SELECT deptno, ename,
row_number()
ORDER BY sal desc NULLS LAST) seq
FROM emp)
WHERE seq <= 3
GROUP BY deptno
/
DEPTNO FIRST SECOND THIRD

---------- ---------- ---------- ----------
10 KING CLARK MILLER
20 SCOTT FORD JONES
30 BLAKE ALLEN TURNER
Note the inner query, that assigned a sequence (RowNr) to each employee
by department number in order of salary.

row_number()
ORDER BY sal desc NULLS LAST) RowNr
FROM emp;
DEPTNO ENAME SAL ROWNR

---------- ---------- ---------- ----------
10 KING 5000 1
10 CLARK 2450 2
10 MILLER 1300 3
20 SCOTT 3000 1
20 FORD 3000 2
20 JONES 2975 3
20 ADAMS 1100 4
20 SMITH 800 5
30 BLAKE 2850 1
30 ALLEN 1600 2
30 TURNER 1500 3

30 WARD 1250 4
30 MARTIN 1250 5
30 JAMES 950 6
The DECODE in the outer query keeps only rows with sequences 1, 2 or 3
and assigns them to the correct "column". The GROUP BY gets rid of the
redundant rows and we are left with our collapsed result. It may be easier to
understand if you see the resultset without the aggregate function MAX
grouped by deptno.
SELECT deptno,
DECODE(seq,1,ename,null) first,
DECODE(seq,2,ename,null) second,
DECODE(seq,3,ename,null) third
FROM (SELECT deptno, ename,
row_number()
ORDER BY sal desc NULLS LAST) seq
FROM emp)
WHERE seq <= 3
/
DEPTNO FIRST SECOND THIRD

---------- ---------- ---------- ----------
10 KING
10 CLARK
10 MILLER
20 SCOTT
20 FORD
20 JONES
30 BLAKE
30 ALLEN
30 TURNER
The MAX aggregate function will be applied by the GROUP BY column

DEPTNO. In any given DEPTNO above only one row will have a non-null value
for FIRST, the remaining rows in that group will always be NULL. The MAX
function will pick out the non-null row and keep that for us. Hence, the group
by and MAX will collapse our resultset, removing the NULL values from it and
giving us what we want.
Conclusion
This new set of functionality holds some exiting possibilities. It opens up a

whole new way of looking at the data. It will remove a lot of procedural code
and complex or inefficient queries that would have taken a long tome to
develop, to achieve the same result.

HTTP WWW - Akadia

Enviado por

Dados do documento

Descrição original:

Título original

Direitos autorais

Formatos disponíveis

Compartilhar este documento

Compartilhar ou incorporar documento

Opções de compartilhamento

Você considera este documento útil?

Este conteúdo é inapropriado?

Direitos autorais:

Formatos disponíveis

HTTP WWW - Akadia

Enviado por

Direitos autorais:

Formatos disponíveis

Analytic Functions in Oracle 8i and 9i

Analytic Functions in Oracle 8i and 9i

Overview and Introduction

How Analytic Functions Work ?

Analytic functions compute an aggregate value based on a group of rows.

http://www.akadia.com/services/ora_analytic_functions.html (1 of 17)1/22/2008 6:09:30 PM

The Syntax of analytic functions is rather straightforward in appearance

Specify the name of an analytic function, Oracle actually

Analytic functions take 0 to 3 arguments.

The PARTITION BY clause logically breaks a single result set into

The ORDER BY clause specifies how the data is sorted within

The windowing clause gives us a way to define a sliding or

http://www.akadia.com/services/ora_analytic_functions.html (2 of 17)1/22/2008 6:09:30 PM

anchored window of data, on which the analytic function will

Example: Calculate a running Total

set autotrace traceonly explain

SELECT ename "Ename", deptno "Deptno", sal "Sal",

Ename Deptno Sal Running Total Dept Total Seq

ADAMS 20 1100 9850 1100 1

ALLEN 30 1600 21225 1600 1

http://www.akadia.com/services/ora_analytic_functions.html (3 of 17)1/22/2008 6:09:30 PM

Further, we were able to compute a running total within each department,

The ROW_NUMBER() function is used to sequentially number the rows

How can we get the Top-N records by some set of fields ?

I would like the top three paid sales reps by department

The problem with this question is that it is ambiguous. It is ambiguous

http://www.akadia.com/services/ora_analytic_functions.html (4 of 17)1/22/2008 6:09:30 PM

set autotrace on explain

DEPTNO ENAME SAL TOP3

http://www.akadia.com/services/ora_analytic_functions.html (5 of 17)1/22/2008 6:09:30 PM

DEPTNO ENAME SAL TOPN

20 SCOTT 3000 1 <--- !

The DENSE_RANK function computes the rank of a row in an ordered group

http://www.akadia.com/services/ora_analytic_functions.html (6 of 17)1/22/2008 6:09:30 PM

The windowing clause gives us a way to define a sliding or anchored window

We can set up windows based on two criteria: RANGES of data values or

break on deptno skip 1

SELECT deptno "Deptno", ename "Ename", sal "Sal",

Deptno Ename Sal Sliding Total

20 ADAMS 1100 1100

http://www.akadia.com/services/ora_analytic_functions.html (7 of 17)1/22/2008 6:09:30 PM

SMITH 800 6775 \-- Sliding Window

30 ALLEN 1600 1600

COUNT(*) OVER (ORDER BY hiredate ASC RANGE 100 PRECEDING)

column ename heading "Name" format a8

SELECT ename, hiredate, hiredate-100 hiredate_pre,

http://www.akadia.com/services/ora_analytic_functions.html (8 of 17)1/22/2008 6:09:30 PM

ORDER BY hiredate ASC

Name Hired Hired-100 Cnt

We ordered the single partition by hiredate ASC. If we look for example at

Compute average salary for defined range

column ename heading "Name" format a8

SELECT ename, hiredate, sal,

Name Hired SAL Avg-100

http://www.akadia.com/services/ora_analytic_functions.html (9 of 17)1/22/2008 6:09:30 PM

ALLEN 20-FEB-81 1600 1200

set numformat 9999

ENAME HIREDATE SAL AVGASC CNTASC AVGDES CNTDES

http://www.akadia.com/services/ora_analytic_functions.html (10 of 17)1/22/2008 6:09:30 PM

KING 17-NOV-81 5000 2671 6 2392 6

Accessing Rows Around Your Current Row

Using analytic functions, this is easy and efficient to do.