Re: Relation Schemata vs. Relation Variables

From: Brian Selzer <brian_at_selzer-software.com>
Date: Mon, 21 Aug 2006 04:34:13 GMT
Message-ID: <9JaGg.11737$kO3.1545_at_newssvr12.news.prodigy.com>


"Marshall" <marshall.spight_at_gmail.com> wrote in message news:1156111859.905334.214100_at_i3g2000cwc.googlegroups.com...
> Brian Selzer wrote:
>> In the context of an update, the predicate of a database along with the
>> current database state determines the set of all /possible states/ that
>> can
>> become current. Integrity rules, which are implicitly or explicity
>> specified as part of the database predicate, can be classified as either
>> state constraints or transition constraints. State constraints define
>> the
>> set of all consistent database states; transition constraints determine
>> whether or not a state change should be allowed. Given a set of
>> consistent
>> database states and the current state, one can derive a set of
>> transitions,
>> each containing what is different on a tuple by tuple basis between the
>> current state and a proposed state (any one of the consistent states). A
>> transition can be defined as a set of triples (r, t, t') where r is the
>> name
>> of a relation, t is a tuple from the current state, and t' is a tuple
>> from
>> the proposed state. t would be empty for an inserted tuple, t' would be
>> empty for a deleted tuple, and neither would be empty for tuples
>> corresponding in an update. From that set of transitions, only those
>> that
>> satisfy all transition constraints are /possible transitions/. The
>> problem
>> is that more than one transition can result in the same /possible state/
>> but
>> that not all of those are /possible transitions/.
>
> In fact, given any modestly complex schema and two database states S1
> and S2, it is possible to generate a superginormous number of different
> sets
> of transitions that would effect the transition from S1 to S2. In fact
> the
> only reason the number is even finite is because we use finite
> computers
> with necessarily finite domains; remove that constraint and I can
> construct
> an infinite number of valid transitions between S1 and S2 for a schema
> containing only a single relvar of only a single attribute.
>
>

Is the cross product of a set of attributes any less superginormous? Yet, that is how relations are defined. How about the set of consistent database states? I find this argument less then compelling.

>> For instance, consider the following states for a relation describing
>> people's marital status, and a transition constraint that says: Single
>> people can't become Divorced:
>>
>> Current Proposed
>> Jane Jones Married Jane Jones Married
>> Jane Smith Single Jane Smith Divorced
>>
>> Should the proposed state be rejected? Or did Jane Jones get Divorced
>> becoming Jane Smith at the same time that Jane Smith married Bob Jones?
>> It
>> is impossible to tell: not enough information has been provided.
>
> So apparently you have this idea that we want to model tuple identity
> 100% of the time with no exceptions, and you have this example
> which shows that it is possible to model a situation where we
> don't have tuple identity, and you expect us to see that as some
> kind of flaw. The problem is that we *don't* want tuple identity
> 100% of the time, so the above doesn't show any flaw. All it
> shows is you intentionally doing the wrong thing for the effect
> you're trying to achieve. That has all it has ever shown; that is
> all it will ever show, no matter how many times you keep posting
> the same example and pointing to it and saying "See the flaw?!"
> The flaw is only in your idea of what the RM should and should
> not be able to do. The ambiguity in your above example? It is
> *supposed* to be possible to do that. If you took away the
> possibility of doing that, it would make the model less expressive.
>

Identity beyond that provided by a candidate key (that is, a single database state) is in the eye of the beholder: whether a value means or represents the same thing in successive database states depends on the perception of the user. In addition, you don't lose any expressiveness at all by updating via transitions instead of updating via relational or multiple assignment. In fact, the effect of relational or multiple assignment can be thought of as degenerate transition forms: the transition would simply be a set of triples in which only t or t' were not empty, but not both. The point is, if the user *knows* that tuples should correspond, there's no mechanism for conveying that information using only relational or multiple assignment, and there's also no mechanism for defining transition constraints (unless perhaps, aggregate functions are involved).

> The idea of transition constraints is inherrently non-set-theoretic,
> because it assumes tuple-identity. Since it is possible to
> model non-tuple-identity scenarios in the RM, there will be
> some scenarios in which transition constraints do not make
> sense. This is not a flaw in RM; rather it is an indication that
> transition constraints are a lower-level albeit useful hack
> on top of some subset of RM schemas, that are applicable
> only some of the time.
>

I disagree. If tuple-identity were non-set-theoretic, then the Principle of Full Normalization is just a bunch of hogwash.

I prefer to use the term limitation instead of flaw. Because one cannot define enforcible transition constraints, the model is incomplete. That's not necessarily a flaw, but it certainly limits what can be done with the model. Codd defined a data model in this way:

<<<<

It is a combination of three components:

  1. a collection of data structure types (the building blocks of any database that conforms to the model);
  2. a collection of operators or inferencing rules, which can be applied to any valid instances of the data types listed in (1), to retrieve or derive data from any parts of those structures in any combinations desired;
  3. a collection of general integrity rules, which implicitly or explicitly define the set of consistent database states or changes of state or both -- these rules may sometimes be expressed as insert-update-delete rules.*

*E. F. Codd, "Data Models in Database Management," Proceedings of the 1980 Workshop on Data Abstraction, Databases and Conceptual Modeling, Pingree Park, Colorado

>>>>

It is clear from this definition that transition constraints should be definable and enforcible, and are not just a useful hack. He also said in the same document that a model without operators or integrity rules should be regarded as partial or incomplete.

>
>> Since a transition that violates a transition constraint can result in
>> the
>> same /possible state/ as a transition that doesn't, the notions of
>> relational assignment and multiple assignment are broken: an update must
>> be
>> submitted as a transition or something equivalent, not just as sets of
>> relation values, otherwise transition constraints cannot always be
>> enforced.
>
> No, that's not what it shows. (And why have you suddenly shifted from
> talking
> about insert/update/delete operations to relational assignment and
> multiple
> assignment? They have different properties, especially with regards to
> tuple identity.) What it shows is that transition constraints are not
> universally applicable, because sometimes we're not modelling
> tuple identity.
>

I wasn't talking about insert/update/delete operations. I was talking about a transition, which is a set which specifies what is different between the current database state and the proposed database state and how. For each element of the set, if t is not empty, then a proposition that was true no longer is; if t' is not empty, then a new proposition is now true. Updates, therefore, are still set-based, the only difference is that now there is enough information to enforce transition constraints if they are defined. Tuples have identity, whether you like it or not: a candidate key value *identifies* a tuple. Using transitions instead of relational or multiple assignment simply allows a user to convey the information that tuples in successive database instances correspond in some way so that transition constraints can be enforced.

>
>> So, should Relation Variables be discarded in favor of Relation Schemata?
>
> This statement doesn't even make any sense. Nor does it appear to have
> anything to do with what came before it.
>

If relational assignment is broken, then there's no need to think in terms of relation variables, because if you can't assign a value to a variable, then they are of no use; instead, you can think in terms of relation schemata.

>
> Marshall
>
Received on Mon Aug 21 2006 - 06:34:13 CEST

Original text of this message