Difference between two rows

One of my recent clients had a real data twisting process to validate. It didn't help much that their source had been corrected by many "ONE-SHOTS" over the years to deal with bugs, and accounting methods mandated by Act of Congress. What we needed was a way to see changes in a stream of related rows. In the end I created for them, a solution that allowed developers to pick two rows from a table at random and compare them in SQL with a result set returned showing only differences. Its a simple thing, but kind of neat, and very useful. So I'd like to share it with you.
The basic idea is as was stated, identify a table and two rows from that table, and feed these to a function, letting it tell you the differences. As you might have guessed, a sql statement is built on the fly and executed via EXECUTE IMMEDIATE, and an Oracle Collection type is used to return a result set. Comments in the function spell out how to use it and code is attached. So lets walk through a quick example.
First we build some data.
create table cust_h
(
cust_id number not null
,row_eff_dt date not null
,row_exp_dt date not null
,cust_name varchar2(10) not null
,credit_limit number
,value_group number not null
)
/
alter table cust_h add primary key (cust_id,row_eff_dt)
/
insert into cust_h values (1001,trunc(sysdate-10),trunc(sysdate-8)-(1/(24*60*60)),'SMYTHE',0,0);
insert into cust_h values (1001,trunc(sysdate-8),trunc(sysdate-7)-(1/(24*60*60)),'SMYTH',0,0);
insert into cust_h values (1001,trunc(sysdate-7),trunc(sysdate-3)-(1/(24*60*60)),'SMYTH',10000,1);
insert into cust_h values (1001,trunc(sysdate-3),trunc(to_date('31-dec-4712'))-(1/(24*60*60)),'SMITH',20000,1);
commit
/
SQL> select * from cust_h order by 1,2
2 /
CUST_ID ROW_EFF_DT ROW_EXP_DT CUST_NAME CREDIT_LIMIT VALUE_GROUP
---------- -------------------- -------------------- ---------- ------------ -----------
1001 18-mar-2007 00:00:00 19-mar-2007 23:59:59 SMYTHE 0 0
1001 20-mar-2007 00:00:00 20-mar-2007 23:59:59 SMYTH 0 0
1001 21-mar-2007 00:00:00 24-mar-2007 23:59:59 SMYTH 10000 1
1001 25-mar-2007 00:00:00 30-dec-4712 23:59:59 SMITH 20000 1
4 rows selected.
We note the following about this data:
1) it is a customer credit file of some kind
2) it is historical in nature with contiguous non-overlapping date ranges (pretty standard)
3) a row is identified uniquely by CUST_ID,ROW_EFF_DT
4) the following basic events happened:
a) we created the customer
b) we corrected a name mis-spelling
c) we approved a credit line and assigned a starting risk level (value group)
d) we increased the credit line and corrected a second name mis-spelling
But because wide tables are not so easily deciphered for changes, it is handy to be able to see just the changes in some form. This is particularly true if we want to examine changes across a set of rows over time. So here is one possible way to do this.
SQL> break on cust_id on row_eff_dt skip 1
SQL> select a.cust_id,a.row_eff_dt
2 ,b.*
3 from (
4 select lag(rowid) over (partition by cust_id order by row_eff_dt) a_rowid
5 ,rowid b_rowid
6 ,cust_h.*
7 from cust_h
8 where cust_id = 1001
9 ) a
10 ,table(cast(kev_utils.show_column_diffs
11 (user,'CUST_H',a_rowid,b_rowid,'row_eff_dt,row_exp_dt')
12 as c_column_value_difference)) b
13 order by 1,2
14 /
CUST_ID ROW_EFF_DT COLUMN_NAME DATA_VALUE_PAIR
---------- -------------------- ------------------------------ --------------------------
1001 18-mar-2007 00:00:00
20-mar-2007 00:00:00 CUST_NAME SMYTHE/SMYTH
21-mar-2007 00:00:00 CREDIT_LIMIT 0/10000
VALUE_GROUP 0/1
25-mar-2007 00:00:00 CREDIT_LIMIT 10000/20000
CUST_NAME SMYTH/SMITH
6 rows selected.
What is going on here:
1) we are using analytics to construct a stream of rows where each row has access to the next row's rowid.
2) we are calling a function with these two related rowids to get differences
3) we are converting the function's returned object to rows, and displaying them
There are thess points of interest to interpreting this data. First understand that all we did was feed the identity of two rows to a function and it compared the rows to find differences. Second, it was our query, not the function, that has constructed a set of input data using analytics such that each row is compared to its next row. Third, in this version of the report we chose to exclude the ROW_EFF_DT and ROW_EXP_DT columns because they would always be different and thus just constitute noise.
So this report shows the changes that occurred between rows. The report reads just like the steps we saw when we dumped the table. Here they are again so you can follow the report.
4) the following basic events happened:
a) we created the customer
b) we corrected a name mis-spelling
c) we approved a credit line and assigned a starting risk level (value group)
d) we increased the credit line and corrected a second name mis-spelling
How you use this is up to you. My testers like it because it helps them figure out how composite rows built using lots of complex logic, got constructed the way they did. This in turn helps them see anomalies in the data and possible errors in the row building processes we use. See attached for full code and restrictions. We used the overloaded function call that accepts rowids.
create or replace type o_column_value_difference is object (
column_name varchar2(30)
,data_value_pair varchar2(4000)
)
/
create or replace type c_column_value_difference is table of o_column_value_difference
/
create or replace
package kev_utils
is
--
-- key_value_pair_p must look like this emp_id=1:emp_ver_no=1|emp_id=1:emp_ver_no=2
-- the routine will change single quotes to double quotes for you
-- and add the WHERE and AND components to finish up a set of where clauses
-- please note we are using the pipe (|) and colon (:) as delimiters
-- so if you are using date formats make sure not to use either in your format string
-- it is expected that each expression yeilds one row thus you should be supplying primary or unique key data
-- maybe some day I'll make this into a generice two where clause deal
--
function show_column_diffs (
owner_p in varchar2
,table_name_p in varchar2
,key_value_pair_p in varchar2
,excluded_columns_list_p in varchar2
) return c_column_value_difference;
function show_column_diffs (
owner_p in varchar2
,table_name_p in varchar2
,a_rowid_p in rowid
,b_rowid_p in rowid
,excluded_columns_list_p in varchar2
) return c_column_value_difference;
end;
/
show errors
Thanks, Kevin
- Kevin Meade's blog
- Login to post comments


Difference between two rows
Hi,
Thanks for a wonderul insight about how analytical functions can be used to a very great extent.
Thanks,
Priya.
Simply out of the box
Hi Kevin,
That was great and something out of the box ...really appreciated.
Regards
Baldeep singh
useful but
it doesn't seem to work.
I try to execute the given query but I get an error message ORA-00942 when it tries to execute the line
10 ,table(cast(kev_utils.show_column_diffs
saying that the table kev_utils.show_column_diffs does not exist.
Any idea about why does this occur?
Thank you very much.
error is telling you what problem is
either
1) you do not have a user package kev_utils
2) the package kev_utils does not have the function show_column_diffs
3) you are executing from a user who does not have the proper privileges to see or execute the routine
Figure out which one it is and fix the problem.
Good luck, Kevin
RE: Useful But
I tried this and ended up with an error:
Error at line 10
ORA-04067: not executed, package body "MARKF.KEV_UTILS" does not exist
ORA-06508: PL/SQL: could not find program unit being called: "MARKF.KEV_UTILS"
ORA-06512: at line 1
I did figure out that is because only the package spec is given here not the package body.
You need to download the attachment
You need to download the attachment
You must log in to see and download the attachment
I believe that will only see and thus only be able to download the attachement, if you are logged into OraFAQ when you are viewing this article. Create and account if you do not already have one, log in and then review the article again. You will see the attachment after the article but before any replies.
Good luck, Kevin