Wednesday, September 14, 2011

How to Split single Column value into multiple rows?

Consider the data set

 with t as
(
 select 101 job_id, 'P00O0496,,P00O0828,P00O2739,P00O3522,P00O4405,P00O7182,P00U1375' str from dual union all
 select 102 job_id, 'P00O0496,,P00O0828,P00O2739,P00O3522,P00O4405,P00U1375' from dual 
union all
 select 103 job_id,'UUKGQ068,UUKGQ069,UUKGQ071,UUKGQ075,UUKGQ077,
UUKGQ083,,,,UUMO12430' from dual
)

Note: Null Values should not be displayed in the result set.

The output should be as shown below

JOB_ID                 STR       
---------------------- ----------
101                    P00O0496  
101                    P00O0828  
101                    P00O2739  
101                    P00O3522  
101                    P00O4405  
101                    P00O7182  
101                    P00U1375  
102                    P00O0496  
102                    P00O0828  
102                    P00O2739  
102                    P00O3522  
102                    P00O4405  
102                    P00U1375  
103                    UUKGQ068  
103                    UUKGQ069  
103                    UUKGQ071  
103                    UUKGQ075  
103                    UUKGQ077  
103                    UUKGQ083  
103                    UUMO12430 

 20 rows selected 


There are two known ways to do this

Method 1: Using Regular expressions and Connect by clause

{Code }

select job_id, regexp_substr(yourcolumn,'[^,]+',1,r) yourvalue , r
from yourtable,
(select rownum r from dual connect by rownum <= 100) max_users 
where   
    r<= length(regexp_replace(yourcolumn,'[^,]')) +1  -- stop condition for max_users
and regexp_substr(yourcolumn,'[^,]+',1,r) is not null -- show only real values
order by job_id
             ,r  -- you need ordering by r, if you want to keep original order of your values
;

{/Code } 


Method 2: Converting data set into XML and parsing it

{Code }

 select job_id, x.str
  from t,
           xmltable('e' passing xmltype('<e><e>' || replace(str, ',', '</e><e>')    ||  '</e></e>').extract('e/e')
                       columns str varchar2(10) path '.') x
   where x.str is not null;

{/Code } 







Sunday, September 11, 2011

How to specify the Window clause (ROW type or RANGE type windows) in Analytic function?

Some analytic functions (AVG, COUNT, FIRST_VALUE, LAST_VALUE, MAX, MIN and SUM among the ones we discussed) can take a window clause to further sub-partition the result and apply the analytic function. An important feature of the windowing clause is that it is dynamic in nature.

The general syntax of the is

[ROW or RANGE] BETWEEN AND
<start_expr> AND <end_expr>

<start_expr> can be any one of the following

    UNBOUNDED PECEDING
    CURRENT ROW
   
<sql_expr> PRECEDING or FOLLOWING.
 
<end_expr> can be any one of the following

    UNBOUNDED FOLLOWING or
    CURRENT ROW or
    <sql_expr> PRECEDING or FOLLOWING.

For ROW type windows the definition is in terms of row numbers before or after the current row. So for ROW type windows
<sql_expr> must evaluate to a positive integer.

For RANGE type windows the definition is in terms of values before or after the current ORDER. We will take this up in details latter.

The ROW or RANGE window cannot appear together in one OVER clause. The window clause is defined in terms of the current row. But may or may not include the current row. The start point of the window and the end point of the window can finish before the current row or after the current row. Only start point cannot come after the end point of the window. In case any point of the window is undefined the default is UNBOUNDED PRECEDING for <start_exp> and UNBOUNDED FOLLOWING for <end_expr>.

If the end point is the current row, syntax only in terms of the start point can be can be

[ROW or RANGE] [<sql_expr> PRECEDING or UNBOUNDED PRECEDING ]

[ROW or RANGE] CURRENT ROW is also allowed but this is redundant. In this case the function behaves as a single-row function and acts only on the current row.

ROW Type Windows

For analytic functions with ROW type windows, the general syntax is:

Function( ) OVER (PARTITIN BY <expr-1> ORDER BY <expr-2> ROWS BETWEEN <start_expr> AND <end_expr>)
or
Function( ) OVER (PARTITON BY <expr-1> ORDER BY <expr-2> ROWS [ <start_expr> PRECEDING or UNBOUNDED PRECEDING]

For ROW type windows the windowing clause is in terms of record numbers.

The query Query-01 has no apparent real life description (except column FROM_PU_C) but the various windowing clause are illustrated by a COUNT(*) function. The count simply shows the number of rows inside the window definition. Note the build up of the count for each column for the YEAR 1981.

The column FROM_P3_TO_F1 shows an example where start point of the window is before the current row and end point of the window is after current row. This is a 5 row window; it shows values less than 5 during the beginning and end.

{code: ** QUERY-01 **}

-- The query below has no apparent real life description (except
-- column FROM_PU_C) but is remarkable in illustrating the various windowing
-- clause by a COUNT(*) function.

SELECT empno, deptno, TO_CHAR(hiredate, 'YYYY') YEAR,
COUNT(*) OVER (PARTITION BY TO_CHAR(hiredate, 'YYYY')
ORDER BY hiredate ROWS BETWEEN 3 PRECEDING AND 1 FOLLOWING) FROM_P3_TO_F1,
COUNT(*) OVER (PARTITION BY TO_CHAR(hiredate, 'YYYY')
ORDER BY hiredate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM_PU_TO_C,
COUNT(*) OVER (PARTITION BY TO_CHAR(hiredate, 'YYYY')
ORDER BY hiredate ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING) FROM_P2_TO_P1,
COUNT(*) OVER (PARTITION BY TO_CHAR(hiredate, 'YYYY')
ORDER BY hiredate ROWS BETWEEN 1 FOLLOWING AND 3 FOLLOWING) FROM_F1_TO_F3
FROM emp
ORDEDR BY hiredate

 EMPNO  DEPTNO YEAR FROM_P3_TO_F1 FROM_PU_TO_C FROM_P2_TO_P1 FROM_F1_TO_F3
------ ------- ---- ------------- ------------ ------------- -------------
  7369      20 1980             1            1             0             0
  7499      30 1981             2            1             0             3
  7521      30 1981             3            2             1             3
  7566      20 1981             4            3             2             3
  7698      30 1981             5            4             3             3
  7782      10 1981             5            5             3             3
  7844      30 1981             5            6             3             3
  7654      30 1981             5            7             3             3
  7839      10 1981             5            8             3             2
  7900      30 1981             5            9             3             1
  7902      20 1981             4           10             3             0

  7934      10 1982             2            1             0             1
  7788      20 1982             2            2             1             0
  7876      20 1983             1            1             0             0

14 rows selected.

{/code: ** QUERY-01 ** }

The column FROM_PU_TO_CURR shows an example where start point of the window is before the current row and end point of the window is the current row. This column only has some real world significance. It can be thought of as the yearly employee build-up of the organization as each employee is getting hired.

The column FROM_P2_TO_P1 shows an example where start point of the window is before the current row and end point of the window is before the current row. This is a 3 row window and the count remains constant after it has got 3 previous rows.

The column FROM_F1_TO_F3 shows an example where start point of the window is after the current row and end point of the window is after the current row. This is a reverse of the previous column. Note how the count declines during the end.

RANGE Windows
For RANGE windows the general syntax is same as that of ROW:

Function( ) OVER (PARTITION BY ORDER BY RANGE BETWEEN AND )
or
Function( ) OVER (PARTITION BY ORDER BY RANGE [ PRECEDING or UNBOUNDED PRECEDING]

For or we can use UNBOUNDED PECEDING, CURRENT ROW or PRECEDING or FOLLOWING. However for RANGE type windows must evaluate to value compatible with ORDER BY expression .

is a logical offset. It must be a constant or expression that evaluates to a positive numeric value or an interval literal. Only one ORDER BY expression is allowed.

If evaluates to a numeric value, then the ORDER BY expr must be a NUMBER or DATE datatype. If evaluates to an interval value, then the ORDER BY expr must be a DATE datatype.

Note the example (Query-02) below which uses RANGE windowing. The important thing here is that the size of the window in terms of the number of records can vary.

{code: ** QUERY-02 ** }

-- For each employee give the count of employees getting half more that their
-- salary and also the count of employees in the departments 20 and 30 getting half
-- less than their salary.

SELECT deptno, empno, sal,
Count(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) CNT_LT_HALF,
COUNT(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) CNT_MT_HALF
FROM emp
WHERE deptno IN (20, 30)
ORDER BY deptno, sal

 DEPTNO  EMPNO   SAL CNT_LT_HALF CNT_MT_HALF
------- ------ ----- ----------- -----------
     20   7369   800           0           3
     20   7876  1100           0           3
     20   7566  2975           2           0
     20   7788  3000           2           0
     20   7902  3000           2           0
     30   7900   950           0           3
     30   7521  1250           0           1
     30   7654  1250           0           1
     30   7844  1500           0           1
     30   7499  1600           0           1
     30   7698  2850           3           0

11 rows selected.

{/code: ** QUERY-02 ** }

Saturday, September 10, 2011

Reset your sequence back to Zero "0"


Oracle does not have a command to reset your sequence back to its beginning value zero. Here I am giving a procedure to reset it back.

{code}
create or replace
procedure reset_sequence(p_seq in varchar2)
is
    l_value number;
begin
-- Select the next value of the sequence
 
    execute immediate
    'select ' || p_seq ||
    '.nextval from dual' INTO l_value;
 
-- Set a negative increment for the sequence,
-- with value = the current value of the sequence
 
    execute immediate
    'alter sequence ' || p_seq ||
    ' increment by -' || l_value || ' minvalue 0';
 
-- Select once from the sequence, to
-- take its current value back to 0
 
    execute immediate
    'select ' || p_seq ||
    '.nextval from dual' INTO l_value;
 
-- Set the increment back to 1
 
    execute immediate
    'alter sequence ' || p_seq ||
    ' increment by 1 minvalue 0';
end;
/
{/code}

How the script works

The script uses dynamic SQL to execute four simple steps.

Step 1 selects the next value of the sequence.

Step 2 sets a negative increment for the sequence, with value = the current value of the sequence.

Step 3 selects once from the sequence. This takes its current value back to 0.

Step 4 sets the increment back to 1.

Wednesday, September 7, 2011

What is DENSE_RANK analytic function? When it can be used?


Let's assume we want to assign a sequential order, or rank, to the payments made by customers based on paid date, we might use the DENSE_RANK function like:


with t
 as
 (
 select 1 id, 'A' name, to_date('05-Sep-11','dd-mon-rr') dt, 10 paid from dual union all
 select 1, 'A', to_date('06-Sep-11','dd-mon-rr'), 12 from dual union all
 select 1, 'A', to_date('07-Sep-11','dd-mon-rr'), 15 from dual union all
 select 1, 'A', to_date('07-Sep-11','dd-mon-rr'), 13 from dual union all
 select 2, 'B', to_date('06-Sep-11','dd-mon-rr'), 13 from dual union all
 select 2, 'B', to_date('07-Sep-11','dd-mon-rr'), 16 from dual union all
 select 2, 'B', to_date('08-Sep-11','dd-mon-rr'), 05  from dual union all
 select 3, 'C', to_date('08-Sep-11','dd-mon-rr'), 12 from dual union all
 select 3, 'C', to_date('09-Sep-11','dd-mon-rr'), 11 from dual
 )
 select id, name, max(paid) keep(dense_rank last order by dt)  last_paid
   from t
   group by id,name


ID     NAME LAST_PAID                   
-------- ---- ---------------------- 
1       A      15                     
2       B      5                      
3       C      11

Dense_Rank should be used along with FIRST or LAST functions. The KEEP keyword is for semantic clarity. It qualifies aggregate_function, indicating that only the FIRST or LAST values of aggregate_function will be returned.

Logic:
  1) Rows are grouped by ID and Name.
  2) Sorts the rows within the group. 
  3) Dense_Rank ranks the rows within the group with respect to sorted
order,say it ranks first row as "1". When there is two same dates within
the same group then Dense_Rank ranks them with the same number.
  4) Then it returns the maximum paid from the group.When there is no two rows with the same ranks within the group max/min works similar. When there is two rows with same rank then max/min fetches the corresponding value.
      
Other example: Select salary which is earned by the most employees+how often
                
with testdata as (
select 1000 as salary from dual
union all
select 1000 as salary from dual
union all
select 1000 as salary from dual
union all
select 20000 as salary from dual
union all
select 2000 as salary from dual
union all
select 10 as salary from dual
)



CNT                    SALARY                 
---------------------- ---------------------- 
3                      1000                   

Query 1

select max(count_emps) as Cnt, max(salary) keep (dense_rank first order by count_emps desc) as Salary
from (SELECT salary, COUNT(*) as count_emps
        FROM testdata
       group by salary )

Query 2

select max(COUNT(*)) cnt, max(salary) keep (dense_rank first order by COUNT(*) desc) salary
FROM testdata
group by salary;





Monday, September 5, 2011

Query to report contiguous runs of sequential numbers across rows

Suppose we have a block of data:


select 0 as n from dual
union all
select 1 as n from dual
union all
select 2 as n from dual
union all
select 3 as n from dual
union all
select 4 as n from dual
union all
select 7 as n from dual
union all
select 8 as n from dual
union all
select 10 as n from dual
union all
select 11 as n from dual
union all
select 12 as n from dual


I'd like a query output like:
first_n, last_n, count
0, 4, 5
7, 8, 2
10, 12, 3

So the first block of 5 numbers: 0,1,2,3,4 is grouped up and min/max/counted (the block has 5 numbers, first is 0 and last is 4 hence result 0,4,5).



Below query solves the problem


"select min(n), max(n), count(*) from (data_block) group by rownum-n"


First it computes "rownum-n" for all the rows in the table say (1,1,1,1,1,-1,-2,-2,-2,-2) then groups it accordingly. 


Converting Multiple Row data into single row data

In some cases we need to have a result which aggregates multiple row data into a single row while displaying the result.

For example:

     When we want to show all the employee names working under a department as a single row we can use the below queries

LISTAGG

LISTAGG function is introduced in Oracle 11g R2.
{code}
SELECT deptno, LISTAGG(ename, ',') WITHIN GROUP (ORDER BY ename) AS 
employees FROM emp GROUP BY deptno;
{/code}

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10   CLARK,KING,MILLER
        20   ADAMS,FORD,JONES,SCOTT,SMITH
        30   ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD



 SYS_CONNECT_BY_PATH

We can use SYS_CONNECT_BY_PATH in releases prior to 11g as shown below
 
 For example consider the below data set

with t as
(select 'PFL' content, 0 i,110 rUID from dual union all
select 'LHL', 0 ,111 from dual union all
select 'PHL', 1, 111 from dual union all
select 'CHL', 2, 111 from dual union all
select 'DHL', 0, 112 from dual union all
select 'VHL', 1, 112 from dual union all
select 'CPHL', 0, 114 from dual union all
select 'WDCL', 1, 114 from dual union all
select 'AHL' ,2 ,114 from dual union all
select 'NFDL', 3, 114 from dual)

If we want the results grouped within its rUID then we can use the below query

Content                               rUID
-------------------------------     -----------    
PFL                                   110
LHL PHL CHL                           111
DHL VHL                               112
CPHL WDCL AHL NFDL                    114

{code}
select trim(sys_connect_by_path(content,' ')) as content, rUID
  from t
 where connect_by_isleaf = 1
 connect by rUID = prior rUID and i = prior i+1
 start with i = 0 
{/code} 
 
we can use WM_CONCAT(undocumented) function before 11g R2 release. 

From 9i onwards 

select 
id,
rtrim (xmlagg (xmlelement (e, value || ',')).extract ('//text()'), ',') values
from test_table
group by id;


Refer link http://www.oracle-base.com/articles/misc/StringAggregationTechniques.php for more solutions

Fetching Environment variables using SQL.

We can get the Oracle Environment variables through the view "V$parameter".

select * from v$parameter where name ='db_cache_size';


The columns
1) ISSES_MODIFIABLE VARCHAR2(5) Indicates whether the parameter can be changed with ALTER SESSION (TRUE) or not (FALSE)


2) ISSYS_MODIFIABLE VARCHAR2(9) Indicates whether the parameter can be changed with ALTER SYSTEM and when the change takes effect

How to reference two rows of data at a time in Oracle SQL?

Today I came across some interesting and useful functions in Oracle. The functions are LAG and LEAD. It is used to get the previous and next record in the table without any self join. Though these functions are there from version 8.1.6, i released its usage now only.


 ex)


01/01/2011 schema_a 100
01/02/2011 schema_a 105
01/03/2011 schema_a 115


Rate of change


01/02/2001 (105 -100) = 5
01/03/2001 (115 -105) = 10


SELECT date_column,schema_name,size_in_mb,
lag(size_in_mb) over(partition by schema_name order by date_column) prior_size_in_mb,size_in_mb - lag(size_in_mb) over(partition by schema_name order by date_column) change_in_mb FROM some_table_name




Similar functions are FIRST and LAST. Please refer http://www.oracle-base.com/articles/misc/LagLeadAnalyticFunctions.php for further details.