Re: The Fact of relational algebra (was Re: Clean Object Class Design -- What is it?)
Date: Mon, 22 Oct 2001 10:24:29 +0200
Message-ID: <3bd3d6f1$0$283$edfadb0f_at_dspool01.news.tele.dk>
"Gary Stephenson" <garys_at_ihug.com.au> wrote in message
news:9qo1l0$o2r$1_at_bugstomper.ihug.com.au...
> Hi Jesper,
>
> First, let me state my position - I am strongly on the relational side of
> the debate. <g>
>
> Whilst I agree with _some_ of what you write, I think there needs to be a
> fundamental distinction made between "database" (or "model" or even "dbms")
> and "application".
I agree.
> Relational theory, value semantics, and suchlike are
> applicable to the "database", whilst objects, IDs, OIDs, references and
> suchlike belong to the "application" domain. The fundamental reason for
> making such a distinction is so that the "database" can safely and
> consistently support the execution of _multiple_ applications concurrently.
You are right, and then you're not.
I don't think it is that black and white. Roughly speaking, I'm pretty sure that your relational theory will only work in theory, unless we're talking about a fairly simple, centralised OLTP system that is perfectly modelled from the start and never evolves.
As an example, take something like quantum computing: it works perfectly fine in theory! However, to make it work in 'real life' we need concepts such as quantum error correction. Such error correction mechanisms are also common in biology. Real-life stuff doesn't seem to know about mathematics. Like the real world, the behaviour of computer systems is teleonomic: they will evolve into a future state and structure that does not exist yet.
I see the OID stuff as a kind of error correction mechanism for relational theory. And of course, it should be transparent to the user. That is, it is not the relational theory that has errors, but the humans who design and use the system -- "programming is a human activity, forget that and everything is lost" (I believe it was Bjarne Stroustrup who said something like that).
You're talking about concurrency, but the fact is that relational theory does not have a solution to this problem (actually, it's part of the problem). Relational databases are built around this 'centralised dogma', and that's what makes them inflexible and slow.
Roughly speaking, the relational databases of today are weak because they don't take into account concepts such as complexity theory and evolution. They are weak because they rely on a strict mathematical theory that breaks down very easily in real-life environments.
[snip]
> > In short, no one must never ever be able to change the values making up the
> > relational key. If they do so, the concept of strong object identity is lost
> > in references cached outside the database.
>
> Exactly, because the relational key _is_ the "object identity" (sic).
> Changing the key makes it an entirely _different_ "object".
Yes, and how do we control that? Your theory doesn't support the concept of evolution of any kind, so we really shouldn't ever change such a key, which is of course impossible.
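To make the point concrete, here is a minimal sketch of what happens to a reference cached outside the database when the key changes. It is only an illustration: the `customer` table, its columns, and the helper names are hypothetical, and it uses Python's built-in sqlite3 and uuid modules rather than any particular product from this thread.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " oid TEXT NOT NULL UNIQUE,"   # hypothetical system-generated ID, never updated
    " email TEXT PRIMARY KEY,"     # hypothetical natural (value-based) key
    " name TEXT NOT NULL)"
)

oid = str(uuid.uuid4())
conn.execute(
    "INSERT INTO customer VALUES (?, ?, ?)", (oid, "ann@example.com", "Ann")
)

# An external system caches both kinds of reference.
cached_key = "ann@example.com"
cached_oid = oid

# Some other client now changes the values making up the relational key.
conn.execute(
    "UPDATE customer SET email = ? WHERE email = ?",
    ("ann@example.org", cached_key),
)


def find_by_key(key):
    """Look a row up by its natural key."""
    return conn.execute(
        "SELECT name FROM customer WHERE email = ?", (key,)
    ).fetchone()


def find_by_oid(object_id):
    """Look a row up by its system-generated ID."""
    return conn.execute(
        "SELECT name FROM customer WHERE oid = ?", (object_id,)
    ).fetchone()


stale = find_by_key(cached_key)    # the cached key no longer matches any row
stable = find_by_oid(cached_oid)   # the cached ID still finds the same row
```

The database itself stays perfectly consistent here. It is the cached natural key held by the outside system that silently stops resolving, while the system-generated ID keeps working.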
>
> > Basically everything will 'break
> > down' and that's exactly what happens in real life (everything screws up)
> > not having any concept of strong object identity (such as system generated
> > GUID's for example).
>
> <snip/>
>
> > Okay, the actual relational database may not break down (because of
> > constraints enforcing internal integrity of the data), however, all the
> > other systems communicating with it will. I really can't see how anyone
> > (even the most hardcore relational theorist such as you Bob) can run away
> > from this fundamental problem.
>
> Because it's _not_ a "problem" at all. It's a feature - one of the very
> elegant and advantageous things about the relational model. If systems
> communicating with the database have been so poorly designed and implemented
> so as to not understand that the current state of the database is _always_
> authoritative, and that (without explicit locking protocols having been
> successfully executed) the data contained therein may change at any time, due
> to the operations of _other_ client systems of the dbms, then those external
> systems are entirely at fault, and deserve to break down!
What is the reason that you're so strongly on the relational side? I really don't understand that.
I will strongly argue for being on both sides (or many sides), and then we will win this war (against complexity and stupid users :-). Stay on one side, and you'll just run into a lot of trouble.
Sure, there are a lot of things in the object world and distributed computing that suck. There are a lot of bad concepts (or, as I like to call them, "ugly rule-of-thumb technologies" with no depth, such as the traditional class and the facade pattern; I really hate that one, the facade pattern :-) that we need to 'clean up', get rid of, and so on.
[snip...]
> Yes, object identity etc. are all well and good for implementing an
> _application_ ("real life system"), just so long as you don't _ever_ assume
> that the database itself has to provide support for them. That is entirely
> analogous to the "horse riding the jockey".
Hmmm, I can't see how that is possible. The two worlds must work together somehow.
I know this is a little far out, but try listening to this little story:
Let's say that relational theory is the perfect foundation for creating models of the real world. Also, let's define a rule or statement saying that, using relational theory, you would (in theory) be able to create all possible models in the world.
Unlike, for example, the object model (which has limits, because it is more specialized), no one would really be able to argue against the above statement, right? It is like Alan Turing, who created this concept of a 'universal machine' and claimed that his machine 'was the best' because it is able to compute every possible program you can think of, given the right initial state. Nobody has (as far as I know) proven (at least in mathematical terms) that he is wrong.
Now, because of this incredible theory, that such a simple linear, sequential and logical machine equipped with the proper program can function as a universal computer, a lot of philosophers and also many great computer scientists and AI folks (e.g. Peter Wegner, Marvin Minsky and many more) have fiddled with the idea of such a universal machine. Although I don't know the exact conclusions of all these experiments, I think it is pretty safe to say that most people working with this thing did not find it as powerful as many mathematicians wanted it to be.
In short, all the people fiddling with this 'wonder machine' were pretty frustrated with it (this is known as the Turing tar-pit), because it couldn't really be used for anything real. Also, Turing himself accepted that the thing was maybe not that powerful.
This great little paper by Peter Wegner (The Paradigm Shift from Algorithms to Interaction) explains the problem very nicely, I think. http://www.jeffsutherland.com/papers/wegacm.pdf
Now, my claim is that relational theory (with its strict constraints and its value-based object identity concept) has many of the same problems as Turing machines.
One problem is that of scalability. Because the Turing machine is strictly rule/algorithm-based, it is not particularly speedy.
Sure, you could buy a SUN 10000 box with 64 CPUs and you'll have enough processing power for most database scenarios. But what do you do when this is not enough? Also, why pay $2 million for an ugly SUN box when a bunch of cheap Linux boxes (with the right distributed object architecture) would be able to do the same job even better?
Basing any large-scale information system on a central "big brother" (such as a relational database) that is to control and oversee every single little data operation is doomed to fail! Such a centralised rule-based architecture is insane because it prevents the system from scaling beyond the capabilities of a single computer. I'm convinced that the centralised database approach to information sharing as we know it today will very soon (hopefully) die a much deserved death.
You relational folks claim that you've got the perfect mathematical model for representing and manipulating data reflecting real-life entities. Obviously, you're mistaken. IMO Alan Turing's universal machine proves that.
To create large-scale distributed systems (say, interactive, evolving real-life environments) we need *more* than just a bunch of static mathematical laws. For example, to succeed with object replication and dynamic load distribution we need (along with a lot of other things) the concept of strong object identity. In short, we need a much more sophisticated architecture that can adapt and decide at runtime which protocols and rules to use.
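As a rough illustration of why replication leans on object identity, here is a small sketch of merging two replicas. Everything in it is an assumption made for the example: the `merge` function, the `oid-1` identifier, and the simple "highest version wins" rule are hypothetical and are not taken from any system discussed here.

```python
def merge(replica_a, replica_b):
    """Merge two replicas keyed by object ID; the higher version wins."""
    merged = dict(replica_a)
    for oid, (version, record) in replica_b.items():
        if oid not in merged or version > merged[oid][0]:
            merged[oid] = (version, record)
    return merged


# The same object as seen on two nodes. Node B has a newer version
# in which the value-based key (the e-mail address) was changed.
replica_a = {"oid-1": (1, {"email": "ann@example.com", "name": "Ann"})}
replica_b = {"oid-1": (2, {"email": "ann@example.org", "name": "Ann"})}

merged = merge(replica_a, replica_b)

# Matching the two copies by value instead gives two distinct keys,
# so nothing says they describe the same object.
records = list(replica_a.values()) + list(replica_b.values())
keys_by_value = {record["email"] for _, record in records}
```

With the object ID as the match criterion, the two copies collapse into one up-to-date record. Matched by value alone, the renamed copy looks like a second, unrelated object.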
As Marvin Minsky once said "There is no one best way to represent knowledge, or to solve problems, and limitations of present-day machine intelligence stem largely from seeking "unified theories," or trying to repair the deficiencies of theoretically neat, but conceptually impoverished ideological positions." (From paper: "Symbolic vs. Connectionist": http://www.ai.mit.edu/people/minsky/papers/SymbolicVs.Connectionist.txt)
The theory you relational dudes have is IMO a 'relational tar-pit', a place where anything is possible but nothing of interest is practical. The harder you struggle to get any real work done, the deeper its inadequacies suck you in. But don't tell anyone that I just said that, because I'll probably get into a lot of trouble :-) Just as much trouble as you guys ought to get into for posting to a newsgroup named comp.object.database claiming that the concept of object identity is useless ;-)
Anyway, it's an interesting discussion.
/Jesper