Re: The Fact of relational algebra (was Re: Clean Object Class Design -- What is it?)

From: Jim Melton <Jim.Melton_at_Technologist.com>
Date: Fri, 19 Oct 2001 06:58:06 GMT
Message-ID: <Jim.Melton-7660D3.00580319102001_at_news1.denver1.co.home.com>


In article <cd3b3cf.0110172209.6320dafd_at_posting.google.com>,  bbadour_at_golden.net (Bob Badour) wrote:

> > Note
> > that at the top of this article, there is a "reference" to the article
> > you posted previously. It is NOT a pointer (there are no physical
> > addresses -- such would be impossible given the different computers and
> > news servers involved).
>
> It is a pointer. It points from your message to my prior message just
> as this message points to your message.

Oh please. By this definition, a telephone number is a pointer. Yet earlier, you used a phone number as a primary key for a join table.

> > Yet, from this reference it is possible to
> > *navigate* to your article to which I am responding.
>
> It is possible to navigate any pointer. The real problem is the
> requirement for navigation. These messages have other properties by
> which we can find messages. For instance, I can ask for all messages
> posted by "Jim Melton". Or, using a join, I can ask for all replies to
> messages posted by "Jim Melton". Or, using a transitive closure, I can
> ask for all subthreads initiated by "Jim Melton".
>
> In relational terms, the values in the headers of these messages serve
> as candidate keys for the messages and thus are useful as handles for
> the purpose of referencing the messages.

Navigation does not require a pointer (unless you want to broadly re-define a pointer as anything that can be navigated). When I pick up the phone and dial a phone number, I am "navigating" the phone number to reach the referent.

Navigation does not preclude querying. Just because an object database has references to persistent data doesn't mean that you couldn't also query on values. A reference value must be bound somehow in the first place.

You seem to like to make these concepts mutually exclusive.
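
To make that concrete, here is a minimal Python sketch (my own illustration, not anything from Adrian's database) in which the same articles are reached both by following a reference and by querying on values. The message IDs, the `Article` type, and all the helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Article:
    message_id: str              # logical identifier, not a memory address
    author: str
    in_reply_to: Optional[str]   # reference to the parent article, if any


# Hypothetical articles, keyed by Message-ID.
ARTICLES = {
    a.message_id: a
    for a in [
        Article("<m1>", "Jim Melton", None),
        Article("<m2>", "Bob Badour", "<m1>"),
        Article("<m3>", "Jim Melton", "<m2>"),
        Article("<m4>", "Adrian", "<m3>"),
    ]
}


def navigate(article):
    """Follow the reference from an article to the article it answers."""
    if article.in_reply_to is None:
        return None
    return ARTICLES.get(article.in_reply_to)


def posted_by(author):
    """Query on a value: every article by the given author."""
    return [a for a in ARTICLES.values() if a.author == author]


def replies_to_author(author):
    """Join: every article whose parent was posted by the given author."""
    return [
        a for a in ARTICLES.values()
        if a.in_reply_to in ARTICLES
        and ARTICLES[a.in_reply_to].author == author
    ]


def subthread(root_id):
    """Transitive closure: every article descended from root_id."""
    found, frontier = [], [root_id]
    while frontier:
        parent = frontier.pop()
        children = [a for a in ARTICLES.values() if a.in_reply_to == parent]
        found.extend(children)
        frontier.extend(c.message_id for c in children)
    return found
```

Both styles work over the same data: `navigate` starts from a reference already in hand, while the other three functions start from values. Neither one rules out the other.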

> > The notion of a "handle to previously stored information" is a logical
> > concept.
>
> If the handle is a logical artifact, then it is logical. If it is a
> physical artifact, then it is physical.

Ok. It's logical.

> > It may not belong in the relational model, but it is still a
> > logical concept.
>
> What do you think "logical identity" is?

Well, given your context, it's a "handle to previously stored information"? Does that make it a pointer?

> > The physical implementation can take any number of
> > forms (including the ascii string above).
>
> The relational model requires that logical identity exist independent
> of physical implementation or physical representation.

That's nice. But this part of the thread started with Adrian's discussion of a feature of the database he is writing that doesn't claim to fully implement the relational model. So what the relational model requires is really incidental to the discussion at hand.

> > > > > Since
> > > > > everything is a pointer (and there are no pointer manipulation
> > > > > operations),
> > > > > Smalltalk
> > > > > does not have the exposures of, say, C or C++ to pointer abuse.
> > > >
> > > > Again, you miss the point. Since they are not exposed, they do not
> > > > exist.
> > >
> > > But they are exposed. As it says above, everything is a pointer.
> >
> > As it says above, pointers are not exposed. "There are no pointer
> > manipulation operations. Pointers do not have the exposures of ... C++."
>
> They are exposed. If two named variables point to the same instance, a
> message sent to one named variable that changes the state of the
> instance changes the observable state of both named variables.

I think we have entirely different notions of what it means for something to be exposed. By this definition a FORTRAN EQUIVALENCE statement "exposes" pointers because two named variables refer to the same state value.
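
A small Python sketch of the distinction I am drawing (Python is standing in for Smalltalk here, which is my assumption, and `Account` is a made-up class). Two names share one instance, yet the program never manipulates the binding itself.

```python
class Account:
    """Hypothetical object with mutable state."""

    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount


a = Account(100)
b = a                # second name bound to the same instance

b.deposit(50)        # a message sent through one name...
shared = a.balance   # ...is observable through the other

# The language will report that the two names share an instance, but
# ordinary code has no operation for doing arithmetic on the binding
# or retargeting it at an arbitrary location.
same_instance = a is b
```

Shared state is visible through both names, which is the behavior you describe. Whether that amounts to "exposing a pointer" is exactly where we differ.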

> > I don't mind you being dogmatic in your view, but would you please at
> > least read first?
>
> I did read. I subscribe to no dogma. I suspect you actually do mind,
> which would explain the ad hominem.

The reference which you quoted (and I carefully retained) explicitly explains how pointers are NOT exposed. This doesn't fit the point you want to make, so the mere fact that the word "pointer" is used "justifies" your (IMO) dogmatic assertion that pointers are exposed.

> > Using CORBA, for
> > example, a client holds an "object reference" to a remote object that is
> > incarnated by (lives in) some server.
>
> A client holds a pointer to a remote object. Your point?

It's not a pointer. It's a reference. There is no requirement that the server incarnating the object reference even be running at the time the operation is invoked for the operation to succeed. Invoking an operation through the reference may cause a program to be launched. That kind of precludes any physical binding on the client side.
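
A rough Python sketch of the idea, under loud assumptions: this is not the CORBA API, and `Registry`, `ObjectRef`, and `Counter` are names I made up for illustration. The client-side reference holds only a logical name, and the referent is incarnated when an operation is invoked.

```python
class Registry:
    """Hypothetical stand-in for whatever locates and activates objects."""

    def __init__(self):
        self._factories = {}   # logical name -> callable that creates the object
        self._active = {}      # logical name -> object currently incarnated
        self.activations = 0

    def register(self, name, factory):
        self._factories[name] = factory

    def resolve(self, name):
        # Incarnate the referent on demand; nothing need be running beforehand.
        if name not in self._active:
            self._active[name] = self._factories[name]()
            self.activations += 1
        return self._active[name]

    def shut_down(self, name):
        self._active.pop(name, None)


class ObjectRef:
    """Holds only a logical name; binding happens at invocation time."""

    def __init__(self, registry, name):
        self._registry = registry
        self._name = name

    def invoke(self, operation, *args):
        target = self._registry.resolve(self._name)
        return getattr(target, operation)(*args)


class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count


registry = Registry()
registry.register("counter-1", Counter)
ref = ObjectRef(registry, "counter-1")   # no Counter exists yet

first = ref.invoke("increment")          # activates a Counter on demand
registry.shut_down("counter-1")          # the "server" goes away
second = ref.invoke("increment")         # the same reference still works
```

In this sketch the reference never changes while the thing behind it is created, discarded, and created again. State is not persisted across activations here; a real server would presumably restore it.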

> > The client invokes operations on
> > the reference that it has. This reference is certainly NOT a pointer, as
> > the referent may be on a different computer on a different continent.
>
> While the reference is not a memory address on the client computer, it
> most definitely is a pointer. It just happens to point to a location
> beyond the scope of the memory addresses of the client computer.

What location exactly *does* it point to?

> > In
> > fact, the referent may not actually be resident in memory in ANY
> > computer at the time the operation is invoked.
>
> But it does reside somewhere.

Unless there is a direct, *physical* mapping between the client reference and the referent, where it resides is irrelevant. This is the point of the reference (and why it is NOT a pointer).

> > Also, invoking the
> > operation on the reference causes a network connection to be established
> > (possibly being routed through several intermediaries), operation
> > arguments to be marshalled into the network byte stream, and all manner
> > of other things to happen on the remote end.
>
> These are all physical issues, which is why I say that your
> "reference" is a physical artifact equivalent to any physical pointer.

No more so than a query submitted to a DBMS. That query causes a network connection to be established between the client and the server; it is marshalled into a stream of bytes over the network; and its result table is marshalled back across the network to the client, where it is bound to programming variables. Is the query therefore a physical artifact equivalent to any physical pointer?

> > Logical == abstraction of physical (hiding implementation details).
>
> I quibble with the above equation. You are working the definitions
> backward. You must start with conceptual.
>
> Conceptual deals with the concepts. Logical deals with the logic.
> Physical deals with the physics.
>
> From the standard vocabulary for databases:

[...]

> 17.03.07 logical level
>
> A level of consideration at which all aspects deal with a database and
> its architecture, consistent with a conceptual schema and the
> corresponding information base, but abstract from its physical
> implementation.

So you agree with me then. Where's the quibble?

> > Logical allows for different physical implementations without affecting
> > the operations defined or available.
> >
> > Now, certainly a reference (as any computer concept) has one or more
> > implementations.
>
> The relational model allows for multiple implementations of its
> logical identifiers -- even within a single database. Under the
> covers, the dbms could implement logical identifiers solely as
> physical pointers. Under the covers, the dbms could implement some of
> them as pointers and some of them as relation, column name, value
> triplets.
>
>
> > One such implementation may actually be a crude pointer
> > as you like to imply. But it is not necessarily so.
>
> No matter how sophisticated, no matter how many levels of indirection
> and no matter how complex the decoding algorithm, a pointer is a
> pointer. I have implied no crudeness nor have I implied any particular
> implementation.

So because I can "navigate" from a primary key value to a table row in a database it is a pointer? I don't buy it.

> I have implied that what you call references are unidirectional,
> asymmetric and implementation dependent.

The reference we have been describing is an interface between a programming language and persistent data in a database. How would such a thing be anything other than unidirectional or asymmetric?

I can't speak to Adrian's interface, but the ODMG does define a standard interface for such references. If the interface is standard, the implementation is irrelevant.

On the other hand, references in persistent objects can certainly be bi-directional and symmetric. I'm not versed enough in ODMG standards to know if they are implementation-dependent, but I think they are.

No, what you have tenaciously (if not dogmatically) asserted repeatedly is that references are pointers. It's just not true.

> > Other
> > implementations are possible without changing the *concept* represented
> > by a reference.
>
> How many different implementations of "reference" does any given SQL99
> database implement? How many different implementations of "reference"
> does any given OOPL implement?

Irrelevant. If two different SQL99 databases have two different implementations, my point is substantiated. The same is true of OOPLs, but they are tangential to this discussion.

> > I will grant you that a reference is used to navigate directly to the
> > referent. You assert that navigation == pointer. I disagree.
>
> What does one navigate if not a pointer? How does one use a reference
> to "navigate directly" if a reference is not physical? If your dogma
> axiomatically dictates that a physical reference is not a pointer even
> in the face of all contrary evidence, nobody will ever convince you.

The same way that a "primary key value" can uniquely identify a specific relation value.

> My question is not whether the object variable can simultaneously
> "live" in two places. My question is: How does it maintain its
> identity as it moves from place to place? How does it maintain the
> same identity in the spreadsheet that it has in the database?

*How* an object maintains identity is intrinsic. How do you maintain your identity when you move from Canada to the US? What you are really asking is how an object maintains identity as it is translated from one frame of reference (database) to another (spreadsheet). In computer terms, this is nearly analogous to asking how (or if) you retain your identity when you die (let's pass on the metaphysical debate).

Since I'm not aware of any spreadsheets that handle objects as objects, it would seem (to use one of your favorite evasions) that this is a straw man.

> > No matter where it lives, it maintains identity. In order for this to be
> > true, it must live in exactly one place (at any instant in time).
>
> No matter where I move, I maintain my identity. In order for this to
> be true, my identity must be independent of location. How does an
> object maintain its identity when it moves from a database to a
> spreadsheet?

You have exactly the same representation in the US and in Canada. In order to compare apples to apples, the representation of the object must be the same in the database as in the spreadsheet.

I don't have any problem positing a completely portable object reference that includes a globally unique identifier that would transcend database implementations, but I don't know of any implementation of such an identifier.

> > Not quite. This gets back to the "data copying" sub-thread we had a
> > while ago. I think your mindset is that the values from the database
> > copied into the spreadsheet have the same "identity" as the values in
> > the database. To see how this is not so, copy values into a spreadsheet
> > and then change values in the database. Since the two representations no
> > longer have the same values, they cannot have the same identity.
>
> Different values have different identity. However, if both the
> database and the spreadsheet claim to make statements about the same
> entity, they need to identify that entity. Stability is a criterion
> for selecting primary keys for this very reason.
>
> For instance, a database and a spreadsheet might make different
> statements regarding my hair colour, my height or my weight, but both
> can use the same logical identifier to identify me.

And the "object" in the database would have different identity than the "object" in the spreadsheet. This is proven by the fact that they have different state at the same time (although identical state is insufficient to prove identity).

Both of these objects "claim to make statements" about yet another object with identity, you.
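
The copying experiment can be sketched in a few lines of Python. The record contents below are invented purely for illustration, and a plain dictionary stands in for both the database and the spreadsheet.

```python
import copy

# Hypothetical "database" holding one record about a person.
database = {"bob": {"name": "Bob Badour", "hair": "brown"}}

# "Copy values into a spreadsheet": an independent copy of the record.
spreadsheet_row = copy.deepcopy(database["bob"])

equal_at_copy_time = spreadsheet_row == database["bob"]   # same state
same_object = spreadsheet_row is database["bob"]          # not the same object

# Change the database; the copy does not follow.
database["bob"]["hair"] = "grey"
equal_after_update = spreadsheet_row == database["bob"]

# Both records still carry the same value naming the person they describe.
same_subject = spreadsheet_row["name"] == database["bob"]["name"]
```

The two records start with equal state, are never the same object, and diverge after the update, yet both keep naming the same third party. That is the distinction between the identity of each record and the identity of the entity they describe.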

> > An OID is *an* implementation of a reference. And please notice that
> > there is no singular implementation of an OID.
>
> Every ODBMS or SQL database implements at most one reference. I agree
> that no singular implementation of these pointers exists, which makes
> these physical pointers non-portable.

The *value* of a reference may not be portable, but the interface certainly can be. The ODMG defines both type-safe references (d_Ref<>) and generic references (d_Ref_Any). These are standardized and portable (interfaces).

Besides, as you are fond of pointing out, a spreadsheet is an application. How an application interfaces with a database is, well, application-specific.

> > But I would still maintain that a database reference would refer to the
> > actual data (in the database).
>
> Does it refer to the data in any way that a human user can understand?

A reference as we have been discussing is a programming language construct. It is not intended for human users. So what?

> Is it useful outside the scope of any given database? Can two
> databases share the same reference to identify the same conceptual
> entity? Can two different vendors' databases share the same reference
> to identify the same conceptual entity?

Interesting question. You aren't talking about the same object residing in two databases because that would violate object identity. So you must be discussing the notion of one database persisting a reference to an object in another database. I don't know of that kind of cooperation among any two vendors.

Can you construct a join between tables in an Oracle database and an Informix one?

> Can other applications use the
> same reference to identify the same conceptual entity?

Yes, absolutely. The reference is to a persistent object. As you are fond of pointing out, the database transcends individual applications.

> > > The value of a candidate key identifies both. When one combines it with a
> > > relation variable and the name of a column, it identifies an object
> > > variable
> > > in the database.
> >
> > A database reference (as Adrian originally described) embodies and
> > encapsulates this combination as a single entity that can be manipulated
> > by a programming language. There doesn't *have* to be any disagreement.
>
> The question is: Does one want to cripple the database? Does one
> prefer to empower the programming language?

Or does one want to acknowledge the reality of existing programming languages?

It's nice to dream up a new language, but until you design it and write it and get it taught and get all kinds of other vendors to support your language, you have only a wish.

> > > > Identifying the "correct" variable in "the database" is always
> > > > problematic.
> > >
> > > Not in a relational dbms. The user observes the values in the spreadsheet
> > > or
> > > on the fax and uses those values to identify variables within the
> > > database.
> >
> > Really. I suppose you've never received e-mail for some "other" Bob
> > Badour.
>
> Never. I know another Bob Badour lives somewhere near Kingston, ON but
> I have never received any of his email. Do you receive e-mail for some
> "other" Jim Melton?

Yep. Quite frequently. Even for one within my own company.

> > Or had a problem with some vendor who mis-entered your social
> > security number (or whatever they use in Canada).
>
> A user is much less likely to mis-enter my SSN than an OID. When the
> database doesn't give any usable representation of identity, the
> database will almost certainly contain multiple variables representing
> me.

Get over it! No one except you has EVER suggested that a human user EVER see an OID. Object databases have values too, you know.

-- 
Jim Melton, novice guru             | So far as we know, our
e-mail: Jim.Melton_at_Technologist.com | computer has never had
v-mail: (303) 971-3846              | an undetected error.
Received on Fri Oct 19 2001 - 08:58:06 CEST
