Re: Practical considerations of dealing with two meanings of NULLs
Date: Thu, 09 Aug 2007 11:19:51 -0700
Message-ID: <1186683591.284488.182260_at_i38g2000prf.googlegroups.com>
On Aug 8, 3:58 am, "sinister" <sinis..._at_nospam.invalid> wrote:
> One simplistic method of trying to distinguish multiple meanings possibly
> associated with NULL is to associate particular non-NULL meanings with those
> values.
>
> For example, I might want to store both "unknown" and "known, but not yet
> filled in by data entry clerk." I could associate NULL to one and a
> particular non-NULL value to the other, or non-NULL values to both.
>
> If the data at issue are e.g. strings or something categorical, this is OK.
> (The only problem I can think of is the case where (in the interface---I'm
> using a webserver for that) I give users a selectbox of predetermined
> choices, and there's an associated textbox where they can put in "Other"
> (non-predetermined values)---what's to stop them from overlooking the
> predetermined choice "Unknown" and typing in "don't know" in the textbox?)
>
> But if the data are numerical, it's messy. It's easy if the data are known
> to be nonnegative, because then one can reserve particular nonnegative
> values for this (e.g. "-1 means not filled in by user," "-2 means known").
> But this method just seems ugly and kludgey to me.
>
> Is there any clean method of dealing with this?
If there are meaningful states associated with a non-value for some field, a clean solution is to have another field to store the meaningful state. For example if the numeric field was Date_Of_Birth, and a possible state (besides having a value) is "user did not specify". This doesn't scale very well as you could end up with an extra field shadowing every optional field. What you really want is a new domain for your field, one that includes all possible dates of the scalar domain (e.g. dates in my example) and that also includes your extra states. The relational model technically allows for this via user-defined domains; the only requirement is that some sort of ordering be implemented so that standard comparison operators work for your new type. Because that's a bit heavyweight, and because existing SQL databases don't support domains (user-defined or otherwise) very well, few practitioners bother to do this if the underlying scalar domain has "available space" to jam in their extra semantics. Thus, in a quantity field, if negative values are not part of the theoretical domain for the value, they would be used just as you've described, in effect getting a user-defined domain in a butt-ugly but workable manner. Received on Thu Aug 09 2007 - 20:19:51 CEST