Sunday, September 11, 2011

How to specify the Window clause (ROW type or RANGE type windows) in Analytic function?

Some analytic functions (AVG, COUNT, FIRST_VALUE, LAST_VALUE, MAX, MIN and SUM among the ones we discussed) can take a window clause to further sub-partition the result and apply the analytic function. An important feature of the windowing clause is that it is dynamic in nature.

The general syntax of the is

[ROW or RANGE] BETWEEN AND
<start_expr> AND <end_expr>

<start_expr> can be any one of the following

    UNBOUNDED PECEDING
    CURRENT ROW
   
<sql_expr> PRECEDING or FOLLOWING.
 
<end_expr> can be any one of the following

    UNBOUNDED FOLLOWING or
    CURRENT ROW or
    <sql_expr> PRECEDING or FOLLOWING.

For ROW type windows the definition is in terms of row numbers before or after the current row. So for ROW type windows
<sql_expr> must evaluate to a positive integer.

For RANGE type windows the definition is in terms of values before or after the current ORDER. We will take this up in details latter.

The ROW or RANGE window cannot appear together in one OVER clause. The window clause is defined in terms of the current row. But may or may not include the current row. The start point of the window and the end point of the window can finish before the current row or after the current row. Only start point cannot come after the end point of the window. In case any point of the window is undefined the default is UNBOUNDED PRECEDING for <start_exp> and UNBOUNDED FOLLOWING for <end_expr>.

If the end point is the current row, syntax only in terms of the start point can be can be

[ROW or RANGE] [<sql_expr> PRECEDING or UNBOUNDED PRECEDING ]

[ROW or RANGE] CURRENT ROW is also allowed but this is redundant. In this case the function behaves as a single-row function and acts only on the current row.

ROW Type Windows

For analytic functions with ROW type windows, the general syntax is:

Function( ) OVER (PARTITIN BY <expr-1> ORDER BY <expr-2> ROWS BETWEEN <start_expr> AND <end_expr>)
or
Function( ) OVER (PARTITON BY <expr-1> ORDER BY <expr-2> ROWS [ <start_expr> PRECEDING or UNBOUNDED PRECEDING]

For ROW type windows the windowing clause is in terms of record numbers.

The query Query-01 has no apparent real life description (except column FROM_PU_C) but the various windowing clause are illustrated by a COUNT(*) function. The count simply shows the number of rows inside the window definition. Note the build up of the count for each column for the YEAR 1981.

The column FROM_P3_TO_F1 shows an example where start point of the window is before the current row and end point of the window is after current row. This is a 5 row window; it shows values less than 5 during the beginning and end.

{code: ** QUERY-01 **}

-- The query below has no apparent real life description (except
-- column FROM_PU_C) but is remarkable in illustrating the various windowing
-- clause by a COUNT(*) function.

SELECT empno, deptno, TO_CHAR(hiredate, 'YYYY') YEAR,
COUNT(*) OVER (PARTITION BY TO_CHAR(hiredate, 'YYYY')
ORDER BY hiredate ROWS BETWEEN 3 PRECEDING AND 1 FOLLOWING) FROM_P3_TO_F1,
COUNT(*) OVER (PARTITION BY TO_CHAR(hiredate, 'YYYY')
ORDER BY hiredate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM_PU_TO_C,
COUNT(*) OVER (PARTITION BY TO_CHAR(hiredate, 'YYYY')
ORDER BY hiredate ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING) FROM_P2_TO_P1,
COUNT(*) OVER (PARTITION BY TO_CHAR(hiredate, 'YYYY')
ORDER BY hiredate ROWS BETWEEN 1 FOLLOWING AND 3 FOLLOWING) FROM_F1_TO_F3
FROM emp
ORDEDR BY hiredate

 EMPNO  DEPTNO YEAR FROM_P3_TO_F1 FROM_PU_TO_C FROM_P2_TO_P1 FROM_F1_TO_F3
------ ------- ---- ------------- ------------ ------------- -------------
  7369      20 1980             1            1             0             0
  7499      30 1981             2            1             0             3
  7521      30 1981             3            2             1             3
  7566      20 1981             4            3             2             3
  7698      30 1981             5            4             3             3
  7782      10 1981             5            5             3             3
  7844      30 1981             5            6             3             3
  7654      30 1981             5            7             3             3
  7839      10 1981             5            8             3             2
  7900      30 1981             5            9             3             1
  7902      20 1981             4           10             3             0

  7934      10 1982             2            1             0             1
  7788      20 1982             2            2             1             0
  7876      20 1983             1            1             0             0

14 rows selected.

{/code: ** QUERY-01 ** }

The column FROM_PU_TO_CURR shows an example where start point of the window is before the current row and end point of the window is the current row. This column only has some real world significance. It can be thought of as the yearly employee build-up of the organization as each employee is getting hired.

The column FROM_P2_TO_P1 shows an example where start point of the window is before the current row and end point of the window is before the current row. This is a 3 row window and the count remains constant after it has got 3 previous rows.

The column FROM_F1_TO_F3 shows an example where start point of the window is after the current row and end point of the window is after the current row. This is a reverse of the previous column. Note how the count declines during the end.

RANGE Windows
For RANGE windows the general syntax is same as that of ROW:

Function( ) OVER (PARTITION BY ORDER BY RANGE BETWEEN AND )
or
Function( ) OVER (PARTITION BY ORDER BY RANGE [ PRECEDING or UNBOUNDED PRECEDING]

For or we can use UNBOUNDED PECEDING, CURRENT ROW or PRECEDING or FOLLOWING. However for RANGE type windows must evaluate to value compatible with ORDER BY expression .

is a logical offset. It must be a constant or expression that evaluates to a positive numeric value or an interval literal. Only one ORDER BY expression is allowed.

If evaluates to a numeric value, then the ORDER BY expr must be a NUMBER or DATE datatype. If evaluates to an interval value, then the ORDER BY expr must be a DATE datatype.

Note the example (Query-02) below which uses RANGE windowing. The important thing here is that the size of the window in terms of the number of records can vary.

{code: ** QUERY-02 ** }

-- For each employee give the count of employees getting half more that their
-- salary and also the count of employees in the departments 20 and 30 getting half
-- less than their salary.

SELECT deptno, empno, sal,
Count(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) CNT_LT_HALF,
COUNT(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) CNT_MT_HALF
FROM emp
WHERE deptno IN (20, 30)
ORDER BY deptno, sal

 DEPTNO  EMPNO   SAL CNT_LT_HALF CNT_MT_HALF
------- ------ ----- ----------- -----------
     20   7369   800           0           3
     20   7876  1100           0           3
     20   7566  2975           2           0
     20   7788  3000           2           0
     20   7902  3000           2           0
     30   7900   950           0           3
     30   7521  1250           0           1
     30   7654  1250           0           1
     30   7844  1500           0           1
     30   7499  1600           0           1
     30   7698  2850           3           0

11 rows selected.

{/code: ** QUERY-02 ** }

No comments:

Post a Comment