Re: A real world example

From: Brian Selzer <brian_at_selzer-software.com>
Date: Thu, 17 Aug 2006 00:12:34 GMT
Message-ID: <SvOEg.11539$o27.9113_at_newssvr21.news.prodigy.com>


"Bob Badour" <bbadour_at_pei.sympatico.ca> wrote in message news:i6MEg.49108$pu3.577664_at_ursa-nb00s0.nbnet.nb.ca...

> JOG wrote:

>> Brian Selzer wrote:
>>
>>>"JOG" <jog_at_cs.nott.ac.uk> wrote in message
>>>news:1155662126.381260.226850_at_m73g2000cwd.googlegroups.com...
>>>
>>>>Brian Selzer wrote:
>>>>[big snips for clarity]
>>>>
>>>>>I agree that it is a violation of the information principle. I also
>>>>>agree
>>>>>that if there's no other way to distinguish things, that is, if there
>>>>>is
>>>>>no
>>>>>relevant natural key, such as for distinguishing individual quarters in
>>>>>a
>>>>>new roll, then they should definitely be exposed to end-users.
>>>>
>>>>fair enough. We are on the same page there then.
>>>>
>>>>
>>>>>I don't agree that the lack of real world identification is the only
>>>>>reason to use
>>>>>surrogates;
>>>>
>>>>It is the only logical reason though. Anything else is a kludge.
>>>>
>>>>
>>>>>I contend that the fact that that real world identification is
>>>>>not *guaranteed* to remain constant precipitates that need. If there
>>>>>IS
>>>>>a
>>>>>natural key, then as a best practice, the surrogate should not appear
>>>>>on
>>>>>any
>>>>>data entry screen or report. In that case, the surrogate is only
>>>>>needed
>>>>>to
>>>>>work around the limitations of the Relational Model, and since the
>>>>>users
>>>>>have another means of identify something, they don't have a need to
>>>>>know.
>>>>
>>>>This still makes no sense to me. In the real world if you have a value
>>>>that is insufficient to identify me over time, then it will be no good
>>>>for the database either.
>>>
>>>I think that every natural key falls into this category, because they CAN
>>>change. It may not happen often. It may not happen at all, but the fact
>>>that it CAN happen is critical, because it shifts the responsibility for
>>>maintaining integrity during updates onto either implementational
>>>extensions
>>>or application programs. I think that that violates data independence,
>>>because it ties the database to a particular implementation or
>>>application.
>
> Because the surrogate--in the sense of an arbitrarily assigned number that 
> forces uniqueness and is never reused--provides no semantic hints 
> regarding identity whatsoever, surrogates similarly force integrity into 
> applications unless one declares all candidate keys to the dbms. If one 
> declares all candidate keys to the dbms, how do Selzer's arguments 
> regarding cascaded updates change in any way shape or form? The unstable 
> keys that he ignorantly calls natural keys will still change and the dbms 
> will still have the same dilemmas.
>

Of course one must declare all candidate keys to the dbms. But if there is a surrogate key in the sense you describe, then it would be stupid to define foreign keys that reference any but the surrogate key, since it is more than stable, it is permanent (unless you're merging databases). Since surrogates don't change, there will be no cascading updates.

> If it happens that someone records duplicate information, one of the 
> surrogates must change eventually. How then does a surrogate differ from 
> any other candidate key?
>

If you declare all other candidate keys, then there won't a duplicate problem, unless there are none. And then you're in the same boat as you would be if you were using a natural key. A surrogate differs from other candidate keys in that its values are permanent (unless you're merging databases). That's the only difference.

> [remaining nonsense snipped without further comment] Received on Thu Aug 17 2006 - 02:12:34 CEST

Original text of this message