Re: The Fact of relational algebra (was Re: Clean Object Class Design -- What is it?)

From: Jim Melton <Jim.Melton_at_Technologist.com>
Date: Tue, 16 Oct 2001 05:35:47 GMT
Message-ID: <Jim.Melton-419E09.23354415102001_at_news1.denver1.co.home.com>


In article <766y7.3062$my7.96943209_at_radon.golden.net>,  "Bob Badour" <bbadour_at_golden.net> wrote:

> > Irrelevant. For one who is keen on physical independence, you refuse to grant
> > the difference between logical and physical when it comes to object references.
> > When you say that a reference (logical construct) is a pointer (physical
> > construct) you are redefining terms.
>
> Neither a reference nor a pointer is a logical construct in the manner you
> used the term. Both are physical constructs. A relation is the logical
> equivalent of a conceptual reference.

According to you, a relation is an unencapsulated set of tuples. What does that have to do with a reference at all?

A reference is a "handle" by which information may be retrieved. Note that at the top of this article, there is a "reference" to the article you posted previously. It is NOT a pointer (there are no physical addresses -- such would be impossible given the different computers and news servers involved). Yet, from this reference it is possible to *navigate* to your article to which I am responding.

The notion of a "handle to previously stored information" is a logical concept. It may not belong in the relational model, but it is still a logical concept. The physical implementation can take any number of forms (including the ASCII string above).

> > > As you can see above, in Java, a reference is a pointer to a pointer.
> > > Additional levels of indirection do not change the pointer nature.
> >
> > But they do. It is the difference between logical construct and a physical
> > implementation.

> > It is the difference between a table and a view.
>
> I agree. Both a table and a view are logical relations. Both a reference and
> a pointer are physical constructs. In terms of use, no important difference
> exists between a table and a view. In terms of use, no important difference
> exists between a reference and a pointer.

Hmmm. I would expect a table to be closer to a physical construct the way you draw the line. A view allows a table (or set of tables) to be transformed to look like a different "table" (but it is NOT a table). It is in fact indirection over the table.

> > > Since
> > > everything is a pointer (and there are no pointer manipulation operations),
> > > Smalltalk
> > > does not have the exposures of, say, C or C++ to pointer abuse.
> >
> > Again, you miss the point. Since they are not exposed, they do not exist.
>
> But they are exposed. As it says above, everything is a pointer.

As it says above, pointers are not exposed. "There are no pointer manipulation operations. Pointers do not have the exposures of ... C++."

I don't mind you being dogmatic in your view, but would you please at least read first?

> > What
> > the compiler does "under the covers" is irrelevant.
>
> Smalltalk exposes pointers to users. All named variables are pointers and
> this is not kept under the covers.

Since there are no pointer operations, the fact that named variables are pointers is not exposed to users. Since the *language implementation* uses pointers, named variables are not bounded in scope like, say, automatic C++ variables or local FORTRAN variables. Instead, they are of global scope and subject to garbage collection. This is an important feature of the Smalltalk language: users do not have to worry about the scope (lifetime) of variables, which differentiates it from other languages (like C++ and FORTRAN). It does not expose pointers to users. It exposes named variables to users.

> > > In C++, of course, the only differences between references and pointers are
> > > syntactic.
> >
> > Posh. There are semantic differences between pointers and references in C++.
>
> And those semantic differences would be...?

A reference can only be bound at initialization time. It cannot be "re-seated" to refer to any other object than the one it was initially bound to. A pointer may point to one object in one statement and a totally different object in the next.

A reference is also described as an "alias". It is used exactly the same as the referent, with no (user-discernible) indirection. A pointer explicitly introduces indirection.

> > But Adrian was not describing a programming language pointer. He was describing
> > a reference to a persistent object (which is not intrinsic to *any* programming
> > language). So it is a new concept.
>
> The combination of a relation name, column name and candidate key value is
> also a reference to a persistent object variable. The concept is old.

And there is no programming language construct to refer to the combination of relation name, column name and candidate key value atomically.

Do you know anything about distributed programming? Using CORBA, for example, a client holds an "object reference" to a remote object that is incarnated by (lives in) some server. The client invokes operations on the reference that it has. This reference is certainly NOT a pointer, as the referent may be on a different computer on a different continent. In fact, the referent may not actually be resident in memory in ANY computer at the time the operation is invoked. Also, invoking the operation on the reference causes a network connection to be established (possibly being routed through several intermediaries), operation arguments to be marshalled into the network byte stream, and all manner of other things to happen on the remote end.

A reference can be quite a bit more than a pointer.
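To make the point concrete, here is a rough sketch in Python. It is NOT CORBA and uses no ORB API; every name in it (Server, Account, ObjectReference, the "acct-42" key) is made up for illustration, and the "network" is just a function call carrying bytes. What it tries to show is that the client holds only a key and a transport, never an address, and that the referent need not be in memory until an operation arrives.

```python
import json


class Account:
    """Hypothetical servant: the object that actually does the work."""

    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


class Server:
    """Stands in for a remote process on some other machine."""

    def __init__(self):
        # Stored state only; no Account object exists in memory yet.
        self._stored = {"acct-42": {"balance": 100}}
        self._active = {}

    def handle(self, request_bytes):
        # Unmarshal the request from the byte stream.
        request = json.loads(request_bytes.decode("utf-8"))
        key = request["key"]
        # Incarnate the servant on demand if it is not resident.
        if key not in self._active:
            self._active[key] = Account(**self._stored[key])
        servant = self._active[key]
        result = getattr(servant, request["op"])(*request["args"])
        return json.dumps(result).encode("utf-8")


class ObjectReference:
    """What the client holds: a key plus a way to reach the server."""

    def __init__(self, transport, key):
        self._transport = transport
        self._key = key

    def invoke(self, op, *args):
        # Marshal the operation name and arguments into bytes.
        request = json.dumps({"key": self._key, "op": op, "args": list(args)})
        reply = self._transport(request.encode("utf-8"))
        return json.loads(reply.decode("utf-8"))


server = Server()
ref = ObjectReference(server.handle, "acct-42")
print(ref.invoke("deposit", 25))  # 125
```

Swap server.handle for something that writes to a socket and the client code does not change. That is the sense in which the reference is a logical construct.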

> The
> physical construct Adrian describes is old too: it's called a pointer.

If you don't have the experience to see the difference, that's OK. But you might want to ease up on the dogma a bit.

> > It has nothing to do with the logical concept.
>
> I believe you are mixing models. What you call a reference is a physical
> equivalent to the reference concept. A pointer is also a physical equivalent
> to the reference concept and equates to what you call reference. The logical
> equivalent to the reference concept is a relation.

OK. I'll define my terms so you know what I mean.

Physical == implementation. A physical pointer points to an address of memory or a sector of a disk, etc.

Logical == abstraction of physical (hiding implementation details). Logical allows for different physical implementations without affecting the operations defined or available.

Now, certainly a reference (as any computer concept) has one or more implementations. One such implementation may actually be a crude pointer as you like to imply. But it is not necessarily so. Other implementations are possible without changing the *concept* represented by a reference.
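A small sketch of what I mean, in Python, with names I have invented for the purpose (Reference, SlotReference, NameReference, describe). One implementation is the "crude pointer" kind, a position in a store; the other is a lookup by name with no position involved at all. The code that uses the reference cannot tell which one it has.

```python
from abc import ABC, abstractmethod


class Reference(ABC):
    """The logical concept: something you can follow to a referent."""

    @abstractmethod
    def get(self):
        ...


class SlotReference(Reference):
    """Pointer-like implementation: a position within a store."""

    def __init__(self, store, index):
        self._store = store
        self._index = index

    def get(self):
        return self._store[self._index]


class NameReference(Reference):
    """Non-positional implementation: lookup by name."""

    def __init__(self, table, name):
        self._table = table
        self._name = name

    def get(self):
        return self._table[self._name]


def describe(ref):
    # Client code sees only the logical operation, never the implementation.
    return f"referent is {ref.get()!r}"


print(describe(SlotReference(["alpha", "beta"], 1)))       # referent is 'beta'
print(describe(NameReference({"second": "beta"}, "second")))  # referent is 'beta'
```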

I will grant you that a reference is used to navigate directly to the referent. You assert that navigation == pointer. I disagree.

> > Nope. I'm just saying that EITHER the object "lives" in your database OR it
> > "lives" in your spreadsheet. It cannot "live" in both places.
>
> Why not? I live in both Canada and the USA. Do I change my identity when I
> cross the border? If the lifetime of the object entails multiple locations,
> your previous statement implies that identity should not change when the
> location changes. The spreadsheet is just another location.

At any instant in time, you occupy exactly one spot in space. You "live" there. Sure, your object can live in both the spreadsheet and the database. Just not simultaneously. That would be like you being in Calgary and New York at the same time.

No matter where it lives, it maintains identity. In order for this to be true, it must live in exactly one place (at any instant in time).

> > Otherwise, you
> > would have two objects of equivalent state (at some snapshot in time).
>
> Do you realise how fundamentally and how thoroughly you have just impeached
> OID or "reference"?

Nope.

> You are saying that OID provides an identity for variables and not for data.
> The variable in the database is different from the variable in the
> spreadsheet, and I would agree. However, both variables describe a single
> real-world entity with unique identity.

Not quite. This gets back to the "data copying" sub-thread we had a while ago. I think your mindset is that the values from the database copied into the spreadsheet have the same "identity" as the values in the database. To see how this is not so, copy values into a spreadsheet and then change values in the database. Since the two representations no longer have the same values, they cannot have the same identity.

If, however, your spreadsheet had an "active link" (reference) into the database, changes in the database values would automatically be reflected in the spreadsheet. Object identity is preserved (the object lives in the database) and the spreadsheet uses the reference.

By the way, Excel already works like this. I've had spreadsheets e-mailed to me with references to external data that did not make the trip. The links are then broken.
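The copy-versus-link distinction can be sketched in a few lines of Python. This is only an illustration: the dictionary stands in for a database, and CopiedCell, LinkedCell and the "cust-7" row are hypothetical names, not anything Excel or any DBMS actually provides.

```python
# A dictionary standing in for the database.
database = {"cust-7": {"city": "Denver"}}


class CopiedCell:
    """Spreadsheet cell holding a copy of the value taken at load time."""

    def __init__(self, db, key, column):
        self._value = db[key][column]

    def value(self):
        return self._value


class LinkedCell:
    """Spreadsheet cell holding a reference; resolved on every read."""

    def __init__(self, db, key, column):
        self._db = db
        self._key = key
        self._column = column

    def value(self):
        return self._db[self._key][self._column]


copied = CopiedCell(database, "cust-7", "city")
linked = LinkedCell(database, "cust-7", "city")

# The database changes after the spreadsheet was built.
database["cust-7"]["city"] = "Boulder"

print(copied.value())  # Denver
print(linked.value())  # Boulder
```

Resolve a LinkedCell against a database that does not contain the row and value() raises KeyError, which is roughly the broken-link case when the external data does not make the trip.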

> You are saying that OID does not provide any facility to identify the actual
> data or real world entity.

Real-world entities don't often lend themselves to computer representation.

An OID is *an* implementation of a reference. And please notice that there is no singular implementation of an OID.

But I would still maintain that a database reference would refer to the actual data (in the database).

> The value of a candidate key identifies both. When one combines it with a
> relation variable and the name of a column, it identifies an object variable
> in the database.

A database reference (as Adrian originally described) embodies and encapsulates this combination as a single entity that can be manipulated by a programming language. There doesn't *have* to be any disagreement.
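Here is roughly what I have in mind, sketched in Python. I am not claiming this is Adrian's design; DatabaseReference, its fields and the nested-dictionary "database" are all assumptions made for the example. The point is only that the three parts travel together as one value a program can store, pass around and dereference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseReference:
    """Relation name, candidate key value and column name as ONE value."""

    relation: str
    key: tuple
    column: str

    def read(self, db):
        return db[self.relation][self.key][self.column]

    def write(self, db, value):
        db[self.relation][self.key][self.column] = value


# Nested dictionaries standing in for relation variables keyed by candidate key.
db = {"employee": {("E100",): {"name": "Ada", "dept": "R&D"}}}

ref = DatabaseReference("employee", ("E100",), "dept")
ref.write(db, "Ops")
print(ref.read(db))  # Ops
```

Because the dataclass is frozen and its fields are hashable, two references built from the same three parts compare equal and can be used as dictionary keys, so identity of the referent follows from the key value rather than from any address.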

> A spreadsheet identifies its variables using a coordinate
> scheme.

Yes, and if the row/column headings are not printed out, you cannot identify that variable from a printout.

> Even in the spreadsheet or on a fax, the candidate key value
> identifies the same entity in the real world.

Well, actually the fax captures state of some entity at some point in time. There is nothing intrinsic in the spreadsheet printout to tell you what the candidate key value is.

And the "real world" is useful for object modelling, but models are necessarily approximations of the "real world", so the fax may or may not identify anything in the "real world".

> > Identifying the "correct" variable in "the database" is always problematic.
>
> Not in a relational dbms. The user observes the values in the spreadsheet or
> on the fax and uses those values to identify variables within the database.

Really. I suppose you've never received e-mail for some "other" Bob Badour. Or had a problem with some vendor who mis-entered your social security number (or whatever they use in Canada). Or received a catalog with your name misspelled.

I will grant you that in theory all relations are unique. But from a spreadsheet fax or a corporate e-mail directory or a bulk-mail database sometimes the wrong values are associated with the wrong "real world" entity. Values are subject to error. References are explicit in their referent.

> > > Can you believe that people accuse ME of platonic idealism???
> >
> > No, but you've sure got a bee in your bonnet about OIDs and human consumption.
>
> Bee or no bee, humans consume identity information.

Humans consume values. They attempt to derive (sometimes incorrectly) identity from them. If your only concern is queries and spreadsheet displays, you have little use for references. There are other uses for databases.

> > How does a relational database "expose objects directly"?
>
> As variables in relation variables uniquely identified by a combination of
> relation variable name, candidate key value and column name. Or as values in
> relations uniquely identified by a combination of relation, candidate key
> value and column name.
>
>
> > Much as you might wish otherwise, the current state of the art is that not all
> > things can be accomplished via set operations internal to the RDBMS. For those
> > cases, database values (since there is typically NOT a tight language binding
> > to allow direct access to variables -- except in object databases) must be
> > "exposed" to a programming language. How is the encapsulation of the object
> > maintained then?
>
> One must construct an application programming language object variable based
> on the observable properties of an object value from the database. To update
> the database, one must change the value or state of a database object
> variable based on the observable properties of an object value in the
> application.

How is it possible to construct a programming language object variable based only on the observable properties of an object value from the database? In order to have the same representation as the database value, doesn't the programming language object need access to the (hidden) internal state of the database object?

> How is encapsulation violated?

By exposing the hidden state of database objects.

-- 
Jim Melton, novice guru             | So far as we know, our
e-mail: Jim.Melton_at_Technologist.com | computer has never had
v-mail: (303) 971-3846              | an undetected error.
Received on Tue Oct 16 2001 - 07:35:47 CEST
