Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Mailing Lists -> Oracle-L -> Re: artificial values vs null
Hi Niall,
I've seen on quite a number of occasions the damage that "artificial" values can do in stuffing up Oracle's assumption of range value distributions and hence Oracle's ability to select an appropriate execution plan.
Setting end dates to some distant future is full of dangers and is a classic example.
Beware indeed !!
Cheers
Richard
----- Original Message -----
From: "Niall Litchfield" <niall.litchfield_at_gmail.com>
To: <nsouto_at_bizmail.com.au>
Cc: <oracle-l_at_freelists.org>
Sent: Friday, September 24, 2004 7:37 PM
Subject: artificial values vs null
On Thu, 23 Sep 2004 21:21:40 +1000, Nuno Souto <nsouto_at_bizmail.com.au>
wrote:
> Whatever the optimiser may do with 31 dec 2099,
> it will be a darn long shot better than a NULL value
> that can't be indexed...
you may regret making that statement.... see below.
>
> Still, the correct design is to add a "status" column
> where the "current" can be indicated. The high date
> is a poor substitute. But still better than having to do
> range or full table scans to find the current row of a
> particular category.
For this purpose I entirely agree. Though in the case of current 'logins' you can bet your bottom dollar that a crashed session will still be shown as current when it isn't. At least then you have demonstrably incorrect data rather than assigning meaning to meaningless data :)
I have created a script to demonstrate what the 'optimizer' does (10g) with the use of various strategies for assigning an artificial value for what should be a NULL. It isn't pretty. You can see the full results at my site
The script is available at
http://www.niall.litchfield.dial.pipex.com/scripts/tests/date_distribution.s
ql
and you are welcome to play around with it on test systems yourself.
What I find is that *in the absence of histograms* then using an artificial value - especially if you do the other thing app designers do and use the wrong datatype - gives you woefully wrong estimates of cardinalities. This may or may not matter too much in the case of a simple single table query, but if the resulting rowset will be used in a join and the stats for cardinality can be wrong (as they are in at least one of my cases) by an order of magnitude then what the CBO will do will most likely not be sensible at all. histograms do fix this, but at a non-zero cost.
I do of course have to admit that my range scan was chosen not just because it runs from christmas to one of my best friends birthdays - but also because it is a good way to illustrate the perils of bad datatypes.
-- http://www.freelists.org/webpage/oracle-lReceived on Fri Sep 24 2004 - 05:35:01 CDT
![]() |
![]() |