Re: A real world example

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sun, 20 Aug 2006 15:21:26 GMT
Message-ID: <W5%Fg.1523$yO7.485_at_newssvr14.news.prodigy.com>

"JOG" <jog_at_cs.nott.ac.uk> wrote in message news:1156047823.641966.136920_at_m79g2000cwm.googlegroups.com...

> Brian Selzer wrote:

>> "JOG" <jog_at_cs.nott.ac.uk> wrote in message
>> news:1156039383.926159.159950_at_m73g2000cwd.googlegroups.com...
>> > Brian Selzer wrote:
>> >> It's possible for something to have its appearance altered without
>> >> altering its essence
>> >
>> > Pish! There is no such thing as an 'essence'. See 'King Milinda's
>> > chariot' for a good explication.
>> >
>> >> and it's also possible for something to be identified by a property
>> >> that can change.
>> >
>> > Nope. Not if you want to compare it over time (which is what you are
>> > talking about). Then it's not an identifying property at all. And that
>> > sort of consideration should be occurring way before the RM is applied.
>> >

>>

>> So, what you're saying is, "Never use natural keys." Right?

>
> No.
>
>>


>> >> For example, consider a line of people at the bank.  Both
>> >> Person and Position are identifying properties.  Assume that you're third in
>> >> line, so Person is you, and Position is 3.  When the guy at the head of the
>> >> line leaves, your Position changes to 2.  Now let's put that in the context
>> >> of a database.  You have a relation with candidate keys Person and Position.
>> >> So the current instance might look something like
>> >>
>> >> {(Bob, 1), (Brian, 2), (You, 3)}.
>> >>
>> >> The proposed instance would look something like
>> >>
>> >> {(Brian, 1), (You, 2)}
>> >>
>> >> Even though Position is a candidate key in each situation and indirectly
>> >> identifies

>> >

>> > 'Indirect identity'? There is no such distinction to be made.
>> >

>>

>> I didn't say indirect identity, I said indirectly identifies. The candidate
>> key value identifies a fact which in turn identifies a thing. There is
>> indeed a distinction.

>>


>> >> an entry in the queue, the value 2 from the current instance
>> >> identifies the tuple containing You in the proposed instance, not the one
>> >> containing Brian.  This illustrates the difference in the frame of reference
>> >> for a candidate key and that for an update, and Position is an example of an
>> >> identifying property that can change.

>> >

>> > Why not extend this? Perhaps brian changed his name to bob while he was
>> > waiting, and queueing positions are changed from numerical to
>> > alphabetical by the bank
>> >
>> > rv1: { (Brian, 2) }
>> > rv2: { (Bob, A) }
>> >
>> > ...and you want to automagically correlate these things? Rather than
>> > think in hindsight maybe the identifiers chosen for the entities
>> > concerned might have been a wee bit of a mistake? Does that not strike
>> > you as making more sense?
>> >

>>

>> If a key can change, it will. It doesn't matter how stable it is. Choosing
>> a stable key only reduces the probability that a change will occur or
>> reduces the frequency of the changes. It does not eliminate the
>> possibility. Your example above supports my argument. Imagine a very large
>> database that is updated tens or hundreds of thousands of times a day. Now
>> assume that the probability of a change occurring is .01%. This means that
>> at least once a day there's a possibility of corrupting the database. The
>> point, even if you can't see it, is that it is not a matter of choosing a
>> more stable key. No update should *ever* be able to violate or circumvent
>> the database predicate. If an update *can* violate integrity rules, then
>> either the data model is broken or the implementation is broken. If the
>> definition of the model cannot prevent it, then the model is broken.

>
> Yes, yes, I follow your logic, but it is still flawed: one attribute
> out there will be completely stable for your specific problem space, if
> you so desire to find it. That's the nature of identity and what allows
> you identification in the first place.
>

Then the definition of the model should reflect that, but I'm not sure that surrogates are the only answer. All that is required is the ability to correlate tuples during an update. One possibility: require that the user reassert the original candidate key values during the update.

The predicate of a database along with the current database instance determines the set of all possible instances that can become current. There should be a way to obtain the set of possible instances from a set of possible transitions, each of which would contain what is different on a tuple-by-tuple basis between the current instance and a proposed instance. A transition could be defined as a set of triples (r, t, t') where r is the name of a relation, t is a tuple from the current instance, and t' is a tuple from the proposed instance. t would be empty for an inserted tuple, t' would be empty for a deleted tuple, and neither would be empty for a corresponding tuple.

I'm positive that I read somewhere that it is possible to transform all state constraints into transition constraints (I'll have to find out where I read this). So, given a current instance and a set of transition constraints, you should be able to construct a set of possible transitions. What I'm not absolutely sure of is whether or not there are more possible transitions than possible instances. For some reason it seems likely, unless infinity is involved. Two different possible transitions could result in the same possible instance.
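
To make the triple notation concrete, here is a minimal Python sketch using the bank queue from earlier in the thread. It is only one possible encoding: the names apply_transition, by_person and by_position are invented for illustration, tuples are plain Python tuples, and None stands in for an "empty" t or t'.

```python
# A transition is a set of triples (r, t, t_new):
#   t is None      -> t_new is inserted
#   t_new is None  -> t is deleted
#   neither None   -> t in the current instance corresponds to t_new

def apply_transition(instance, transition):
    """Return the proposed instance obtained by applying a transition.

    `instance` maps relation names to frozensets of tuples.
    """
    proposed = {r: set(body) for r, body in instance.items()}
    # Remove every old tuple first, then add the new ones, so that a value
    # vacated by one triple can be reused by another (e.g. Position 2).
    for r, t, _ in transition:
        if t is not None:
            proposed.setdefault(r, set()).discard(t)
    for r, _, t_new in transition:
        if t_new is not None:
            proposed.setdefault(r, set()).add(t_new)
    return {r: frozenset(body) for r, body in proposed.items()}


current = {"Queue": frozenset({("Bob", 1), ("Brian", 2), ("You", 3)})}

# Tuples correlated by Person: Bob leaves, everyone else moves up.
by_person = {
    ("Queue", ("Bob", 1), None),
    ("Queue", ("Brian", 2), ("Brian", 1)),
    ("Queue", ("You", 3), ("You", 2)),
}

# Tuples correlated by Position: slots 1 and 2 get new occupants, slot 3 goes.
by_position = {
    ("Queue", ("Bob", 1), ("Brian", 1)),
    ("Queue", ("Brian", 2), ("You", 2)),
    ("Queue", ("You", 3), None),
}

same = apply_transition(current, by_person) == apply_transition(current, by_position)
print(same)
```

Under this encoding the two transitions are different sets of triples, yet both lead to the instance {(Brian, 1), (You, 2)}.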

I was just thinking: if more than one possible transition can result in the same possible instance, then given only the current instance and a proposed instance, can you determine which transition a transition constraint should enforce? Does it matter? That is, is it possible for one transition that determines a possible instance to be prohibited while another transition that determines the same possible instance is accepted? If it does matter, then the notions of relational assignment and multiple assignment are broken: updates would have to be submitted as transitions or something equivalent, not just as sets of relation values.
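
One way to explore the question is with the bank queue from earlier in this thread and an assumed transition constraint: "a queue entry may change its Position but never its Person". The constraint, the function names and the list-of-triples encoding (None for an empty t or t') are all hypothetical, chosen only to illustrate the point.

```python
def person_is_stable(transition):
    """Hypothetical transition constraint: in every corresponding pair
    (t, t_new), the Person attribute (index 0) must be unchanged."""
    return all(
        t[0] == t_new[0]
        for _, t, t_new in transition
        if t is not None and t_new is not None
    )


def result(instance, transition):
    """Proposed relation body: drop every old t, then add every new t_new."""
    old = {t for _, t, _ in transition if t is not None}
    new = {t_new for _, _, t_new in transition if t_new is not None}
    return (instance - old) | new


current = {("Bob", 1), ("Brian", 2), ("You", 3)}

# Bob leaves; Brian and You each move up one place.
moved_up = [
    ("Queue", ("Bob", 1), None),
    ("Queue", ("Brian", 2), ("Brian", 1)),
    ("Queue", ("You", 3), ("You", 2)),
]

# Same end state, but read as "the entry at each Position got a new Person".
relabelled = [
    ("Queue", ("Bob", 1), ("Brian", 1)),
    ("Queue", ("Brian", 2), ("You", 2)),
    ("Queue", ("You", 3), None),
]

assert result(current, moved_up) == result(current, relabelled)
print(person_is_stable(moved_up), person_is_stable(relabelled))  # True False
```

With this particular constraint, the two relation values alone are not enough to decide: the verdict depends on which tuples are taken to correspond.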

A light bulb just came on! Whether or not different candidate key values identify the same thing in successive database instances is something that the user must determine, not the model, but there must be some mechanism for a user to assert that, otherwise transition constraints cannot always be enforced. It is up to the user to correlate tuples, not the model, but if a transition constraint is to be enforced, then the system must be informed how they correlate, and thus the model should reflect that. Perhaps this supports Fabian Pascal's objections to variables and relational assignment.
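
A rough sketch of what such a mechanism might look like, assuming the user restates the original candidate key values with each update. The function transition_from_update and its parameters are invented for this illustration and are not part of any existing DBMS or language; None marks a delete or an insert.

```python
def transition_from_update(relation, current, key_index, changes, inserts=()):
    """Build (r, t, t_new) triples from a user-supplied correlation.

    `changes` maps an ORIGINAL candidate key value to the new tuple,
    or to None for a delete. `key_index` is the key attribute's position.
    """
    # Index the current instance by the chosen candidate key.
    by_key = {t[key_index]: t for t in current}
    triples = set()
    for old_key, t_new in changes.items():
        if old_key not in by_key:
            raise KeyError(f"no tuple with key {old_key!r} in current instance")
        triples.add((relation, by_key[old_key], t_new))
    for t_new in inserts:
        triples.add((relation, None, t_new))
    return triples


current = {("Bob", 1), ("Brian", 2), ("You", 3)}

# The user restates the ORIGINAL Position values (index 1) for each change.
transition = transition_from_update(
    "Queue",
    current,
    key_index=1,
    changes={1: None, 2: ("Brian", 1), 3: ("You", 2)},
)
```

Here the correlation comes entirely from the user: the system is told that the entry that was at Position 2 is the one now at Position 1, rather than having to infer it from the two relation values.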

> With a surrogate you are representing that attribute anyhow. You're
> just completely wrong to hide it from the thing that needs to use it as
> identification. (all your examples rely on an entity having some sort
> of memory of what previous state it was in, which is an incredible
> assumption)
>

I concede that it is wrong to hide it. But whether or not it's hidden is a diversion from the main issue I'm trying to describe.

I don't think it's an assumption at all. Enforcement of any transition constraint depends not only on the definition of the constraint, but also on the current database instance in order to determine whether or not a proposed database instance should be rejected.

>>


>> >>

>> >> >>> >> > Maybe, but from a functional standpoint, that operator is just a
>> >> >>> >> > function (e.g. "subtract $500 from X), in which the balance is a free
>> >> >>> >> > variable. Only in an imperative world does that involve "knowing"
>> >> >>> >> > (referencing) the "previous" balance. Function application means
>> >> >>> >> > there's no "query" of the value prior to the update.
>> [snip]
> Received on Sun Aug 20 2006 - 17:21:26 CEST
