help with a SQL/view problem

From: Bill Ferguson <wbfergus_at_gmail.com>
Date: Mon, 15 Feb 2010 13:29:08 -0700
Message-ID: <4025610e1002151229n7a52973i92c155a5e7edc85e_at_mail.gmail.com>



First off, in case of any differences, I'm running 11.1.0.7 on Windows Server 2003.

Project 'management' wants a view to see the population of a certain schema, with the population percentage of each (data) field in a table and also the population percentage of that field in relation to to the 'parent' table.

Here is what I have so far:
CREATE OR REPLACE FORCE VIEW PERCENT_FILLED (

   TABLE_NAME,
   COLUMN_NAME,
   DATA_TYPE,
   IS_REQUIRED,
   DEFAULT_VALUE,
   NUM_DISTINCT,
   NUM_NULLS,
   LAST_ANALYZED,
   NUM_ROWS,
"Table Pct Populated",
"DB Pct Populated"

)
AS

     SELECT a.TABLE_NAME,
            a.COLUMN_NAME,
            a.DATA_TYPE,
            DECODE (a.NULLABLE,  'N', '*',  'Y', NULL) is_required,
            a.DATA_DEFAULT DEFAULT_VALUE,
            a.NUM_DISTINCT,
            a.NUM_NULLS,
            a.LAST_ANALYZED,
            b.num_rows,
            ROUND ( ( (b.num_rows - a.num_nulls) / b.num_rows) * 100)
               "Table Pct Populated",
            ROUND (
               ( (b.num_rows - a.num_nulls)
                / (  SELECT COUNT (*) FROM deposits_base))
               * 100)
               "DB Pct Populated"
       FROM    dba_tab_cols a
            LEFT OUTER JOIN
               dba_tab_statistics b
            ON (a.owner = b.owner AND a.table_name = b.table_name)
      WHERE a.owner = 'USGS' AND (a.table_name LIKE '%_BASE')
            AND a.column_name NOT IN
                   ('INSERT_DATE', 'INSERTED_BY', 'UPDATE_DATE', 'UPDATED_BY')
            AND a.column_name NOT LIKE 'SYS%'
            AND a.table_name NOT LIKE 'MLOG$_%'
            AND a.table_name NOT LIKE 'RUPD$_%'
  • and column_name not in ('MAS_ID', 'MRDS_ID') ORDER BY a.table_name, a.column_id;

So, obviously in this case 'USGS' is the schema and all of my 'data' tables that I need to report on all end with '_BASE'. I'm filtering out a few additional columns and tables as well, so I only report on the actual 'data' fields. My parent table is called DEPOSITS_BASE and my primary key field is called DEP_ID.

For the most part, the view seems to work pretty well, but I am having problems getting an accurate number for the last column ("DB Pct Populated"). The number being reported is actually the percentage of that column populated based upon the total number of rows in the DEPOSITS_BASE table, not the percentage of that column populated based upon distinct DEP_ID's.

It seems like it should be pretty straightforward, but for the life of me I haven't been able to figure it out yet (to long of a weekend?). Can somebody help me out here please? It should only require a couple of changes to run similarly on any schema you want to run it against.

Thanks.

-- 
-- Bill Ferguson
--
http://www.freelists.org/webpage/oracle-l
Received on Mon Feb 15 2010 - 14:29:08 CST

Original text of this message