Re: NULLs: theoretical problems?
Date: Sat, 18 Aug 2007 08:56:16 GMT
Message-ID: <Quyxi.46$6h3.31_at_trndny05>
"David Portas" <REMOVE_BEFORE_REPLYING_dportas_at_acm.org> wrote in message
news:l_Cdnbfs5KQfllvbRVnytgA_at_giganews.com...
> SELECT DISTINCT i FROM t WHERE x = x;
> "David Cressey" <cressey73_at_verizon.net> wrote in message
> news:UGmxi.11$dz3.4_at_trndny01...
> >
> >> If those are the ONLY propositions that you intend to represent with
your
> >> database then you might have a point. The theoretical abhorrence arises
> >> as
> >> soon as we try to derive other propositions from such a database. Then
> >> three-value logic means that true propositions in the real world are no
> >> longer true and false ones are no longer false.
> >
> > In theory, you are right. In practice, it is really quite simple to
> > avoid
> > queries that can be tripped up by three valued logic.
>
> That's a startling claim. For three reasons I disagree.
>
> Firstly, annecdotal evidence suggests exactly the opposite: developers
have
> enormous trouble with nulls. They get tripped up all the time with
incorrect
> results in queries, either because they forgot to consider the null case
or
> because they didn't work out the three-value predicates properly. I'm a
very
> experienced SQL developer and I still make those kinds of mistakes too
> often.
>
> Secondly, even for the perfect programmer this is demonstrably not a
simple
> problem. Determining whether some predicate is a tautology is known to be
> NP-hard. Nearly all tautologies (in two-value logic terms) produce
> potentially incorrect results under three-value logic. For example:
>
> SELECT DISTINCT i FROM t WHERE x < 0 OR x >=0;
>
> is not equivalent to:
>
> SELECT DISTINCT i FROM t;
>
> Thirdly, even the validity of the logical operators themselves is in
> question under three value logic. That's because their "correct" operation
> actually depends on what interpretation a user attaches to a null. Yet
there
> is no universally accepted interpretation for nulls! Codd thought there
were
> two different meanings for null "marks" - but he was wrong. There are many
> more than 2 meanings and I strongly suspect that NONE of them will give
good
> results for all possible queries. The simplest example of all, consider:
>
> SELECT DISTINCT i FROM t WHERE x = x;
>
> The result is clearly wrong if a null means "x is unknown" or "x is out of
> range" and possibly wrong if null means "x does not apply".
>
> If you admit nulls then you must deal with predicates that reference
> nullable attributes. I know of no foolproof way to ensure that those
> predicates give the correct results.
>
> >
> > This is even true for queries that may contain some NULLs in the result
> > table. Just as well, because when doing outer joins, one would come up
> > with NULLS in the result table, even if the base tables were designed in
> > such a way that NULL need never be permitted to be stored.
> >
>
> Outer join is only a syntax shortcut for a union. Any system that has an
> outer join syntax ought to have some way to specify the values used to
fill
> in the gaps. SQL does: the COALESCE function.
>
> --
> David Portas
>
>
My own anecdotal evidence suggest that it is, in reality, quite simple, at
least in the cases I have dealt with. It is entirely possible that a great
many developers have a great deal of difficulty with something that is quite
simple. I've seen that over and over again in computing.
In situations where I've observed other programmers having trouble with 3-valued logic, it nearly always boils down to using possibly omitted values in a WHERE clause. In simple programming situations, the only selectivity required in WHERE clauses deals only with cases where there is a value to compare.
> Outer join is only a syntax shortcut for a union. Any system that has an
> outer join syntax ought to have some way to specify the values used to
fill
> in the gaps. SQL does: the COALESCE function.
That assumes that the "gaps" are places where data is missing and there "ought to be data there". Sometimes, a gap is just a gap. Hugo Kornelis has outlined an excellent summary in which the development of "-" in paper and pencil data, or NULL in an SQL system might represent merely the difference between the rectangular form of a result table, and the non rectangular form of the data being sought. If you assert that meaningful data being sought is always in rectangular form, then you and I think differently. Received on Sat Aug 18 2007 - 10:56:16 CEST