Re: A real world example

From: Brian Selzer <brian_at_selzer-software.com>
Date: Thu, 17 Aug 2006 03:05:03 GMT
Message-ID: <z1REg.8554$%j7.2710_at_newssvr29.news.prodigy.net>


"J M Davitt" <jdavitt_at_aeneas.net> wrote in message news:qFPEg.76061$Eh1.21923_at_tornado.ohiordc.rr.com...
> Brian Selzer wrote:

>> "JOG" <jog_at_cs.nott.ac.uk> wrote in message 
>> news:1155685450.575606.117120_at_b28g2000cwb.googlegroups.com...
>>
>>>Brian Selzer wrote:
>>>
>>>>"JOG" <jog_at_cs.nott.ac.uk> wrote in message
>>>>news:1155662126.381260.226850_at_m73g2000cwd.googlegroups.com...
>>>>
>>>>>Brian Selzer wrote:
>>>>>[big snips for clarity]

> [more snips for more clarity]
>
>> Note: I categorically reject the notion that the application should issue 
>> key updates separately.  If the system allows ad-hoc queries (and most 
>> do), then one could issue such a change, bypassing the application.
>

> I think the "bypassing the application" concern reveals
> more of your perspective than everything else you've
> written so far. The fact that you somehow see tension
> between application requirements and data requirements
> and seem to regard the application as more important
> indicates shortsightedness.
>

> (The terms are mine and I apologize if my use of the
> terms is a mischaracterization of Brian's thoughts.)
>

One argument that I've heard in the past is that key changes should be segregated. The only way to do that is in an application; hence the point. As far as tension, it's there, but not in the way you've characterized. I think that the data requirements are more important: if the information can't be relied upon, then it doesn't matter how good an application is.

>>>It is important to see that "Me now" is a completely different entity
>>>to "Me over my whole lifetime". The temporal issue is irrelevant, all
>>>that matters is to recognise they are just different entites. I know
>>>this is initially seems an obtuse philosophical point, but it has
>>>_real_ consequences for how to model those entities.
>>
>> I see the difference, but I can't see how you can shrug off the temporal 
>> issue.  They must have a common property because they're related.  I 
>> would argue that "Me now" is part of "Me over my whole lifetime."
>

> If some sort of temporal continuity is important, the
> design must include features that meet the requirement --
> but some naively suggested surrogate key isn't the answer...
>
If you've read the rest of my argument, then I ask you: how can you possibly use a temporal attribute to make the tuples in the database instance preceding a change correspond to tuples in the database instance succeeding a change?
> [snip]
>
>> In an earlier post, I used the terms, "individual" and "universal" to 
>> describe these categories of properties.  Individual properties define 
>> the essence of something and remain constant throughout its lifetime. 
>> Universal properties depend on the state of the universe during that 
>> lifetime.  Universal properties are the only ones that can change.
>

> ..even if you proclaim that your made-up value is an
> "individual property."
>

> All this...
>
>>>>>>Here's a simple example of what can happen:
>>>>>>
>>>>>>create table P
>>>>>>(
>>>>>> x int primary key,
>>>>>> y int
>>>>>>)
>>>>>>create table F
>>>>>>(
>>>>>> x int primary key references P(x),
>>>>>> z int
>>>>>>)
>>>>>>insert P (x, y) values (1, 1)
>>>>>>insert P (x, y) values (2, 3)
>>>>>>insert P (x, y) values (3, 2)
>>>>>>insert F (x, z) values (1, 5)
>>>>>>insert F (x, z) values (2, 4)
>>>>>>select * from P join F on (p.x = F.x)
>>>>>>update P
>>>>>> set x = case x
>>>>>>    when 1 then 2
>>>>>>    when 2 then 1
>>>>>>   end
>>>>>> where x in (1, 2)
>>>>>>select * from P join F on (p.x = f.x)
>>>>>>
>>>>>>RESULTS:
>>>>>>
>>>>>>Before:
>>>>>>x           y           x           z
>>>>>>------ -------- -------  --------
>>>>>>1           1           1           5
>>>>>>2           3           2           4
>>>>>>
>>>>>>After:
>>>>>>x           y           x           z
>>>>>>------ -------  -------- --------
>>>>>>1           3           1           5
>>>>>>2           1           2           4
>>>>>>
>>>>>>
>>>>>>Should the new values for x in P have cascaded into F?  Assume that 
>>>>>>the
>>>>>>system only has the before image and the after image (the first two
>>>>>>columns
>>>>>>in RESULTS) in order to complete the update.  How can such a system
>>>>>>differentiate between the above update and the following update?
>>>>>>
>>>>>>update P
>>>>>> set y = case y
>>>>>>    when 1 then 3
>>>>>>    when 3 then 1
>>>>>>   end
>>>>>> where x in (1, 2)
>>>>>>
>>>>>>Clearly first update affects a key, and consequently, the changes 
>>>>>>should
>>>>>>cascade, but with the information available (both updates produce the
>>>>>>exact
>>>>>>same before and after images), the system cannot differentiate between
>>>>>>the
>>>>>>two updates; therefore, it cannot determine whether or not to cascade 
>>>>>>the
>>>>>>changes.  Updates within the Relational Model are are constrained in 
>>>>>>the
>>>>>>same way as this hypothetical system.  All that is available is the
>>>>>>preceding instance and the succeeding instance, and if the only key 
>>>>>>can
>>>>>>change, then there is no way to correlate tuples.
>

> ..doesn't describe a general case. Without specific
> requirements, one can't say "Clearly...the changes
> should cascade." Yes, "[u]dates within the Relational
> Model" occur as you depicted -- but whether changes to
> any value is appropriate or not has nothing to do with
> relational theory. To me, it seems that you're trying
> to invent requirements for the model because you can't
> see how to fulfill them in your design.

That's not the case at all. I see exactly how to fulfill them. The point is: I shouldn't have to. I should be able to define declarative constraints to ensure the integrity of the data. That means not only within each database instance, but also between two successive database instances because the model allows a database to be updated and describes how it happens, that is, updates are set-based. Now, if an update involves more than one tuple, and that update depends on the current database instance, then there should be a way to correleate the tuples in the current database instance with those in the proposed instance. I've shown that candidate keys as defined are not sufficient. Can you think of another way? I shouldn't be able to compromise integrity with an ad-hoc update. Received on Thu Aug 17 2006 - 05:05:03 CEST

Original text of this message