Re: Object-relational impedance

From: David BL <davidbl_at_iinet.net.au>
Date: Fri, 14 Mar 2008 21:28:16 -0700 (PDT)
Message-ID: <d62a9880-43b2-4d2a-8e0a-5d35cfd05265_at_s19g2000prg.googlegroups.com>


On Mar 15, 12:14 pm, Marshall <marshall.spi..._at_gmail.com> wrote:
> On Mar 14, 5:59 pm, David BL <davi..._at_iinet.net.au> wrote:
>
> > - the impossibility of reliable distributed transactions
>
> Are they actually impossible? I know that distributed consensus
> is impossible; Byzantine generals and all. I know little about
> transactions, but my vague impression was that 2PC was
> stronger than just an illusion.

I mean 100% reliable in the face of arbitrary network failures.

Date puts it like this:

"There does not exist any finite protocol that will guarantee that all participants will commit successful transactions in unison and roll back unsuccessful transactions in unison, in the face of arbitrary failures"

The proof is very simple. See An Introduction to Database Systems, 8th edition, page 668.
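
To see where it bites, here's a rough Python sketch of a 2PC coordinator (the prepare/commit/abort interface is invented for illustration; any call may fail because the network does). Once every participant has voted yes the decision is commit, but a participant whose decision message is lost is left "in doubt", and no finite number of extra acknowledgement rounds removes that window.

    def two_phase_commit(participants):
        # Phase 1: collect votes.  Failure here is harmless: nobody has
        # committed yet, so the coordinator can still abort everyone.
        for p in participants:
            try:
                if not p.prepare():          # participant votes "no"
                    raise RuntimeError("vote was no")
            except Exception:
                for q in participants:
                    try:
                        q.abort()            # safe while nobody has committed
                    except Exception:
                        pass                 # unreachable participants will
                                             # time out and abort on their own
                return False

        # Phase 2: the decision is now irrevocably COMMIT.
        for p in participants:
            while True:
                try:
                    p.commit()
                    break
                except Exception:
                    pass   # participant is "in doubt"; all we can do is retry

        # The retry loop may never terminate, and no finite number of extra
        # acknowledgement rounds closes the window: whatever the protocol,
        # the last message can always be the one that gets lost.
        return True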

> > - the fact that synchronous messages over the wire can easily
> > be a million times slower than in-process calls
>
> This is *crucial.*
>
> > - the fallibility of distributed synchronous messages which
> > contradicts location transparency
>
> This is manageable when you combine synchronous and
> asynchronous messaging. Synchronous idempotent messaging
> that does not depend on the identity of the receiver is actually
> pretty easy; asynchronous messaging should handle as much
> as possible. Synchronous messaging that does depend on
> the identity of the receiver should be kept to an absolute
> minimum, and has to be tolerant of failure. The guidelines
> I just described might exclude certain classes of applications
> (I'm not certain) but many, many things can be done this way.
>
> Location transparency as a practical reality is achievable and
> in fact absolutely tits when done well.

I'm certainly not saying the idea of location transparency is completely worthless. Clearly there exist useful distributed state machines where location transparency has been achieved. However, there are many more examples where in-process objects cannot be moved out of process without breaking the system, whether because of network failure or because of the many-orders-of-magnitude drop in performance. In practice, location transparency has to be designed for, which in a way is at odds with its premise.
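
To make that concrete, here's a rough Python sketch of the kind of proxy that's supposed to provide the transparency (the transport and method names are invented). Notice how the timeout, the retry policy and the requirement that the operation be idempotent all leak into the design - none of which a genuinely local call imposes.

    import time

    class RemoteProxy:
        # Makes service.get_balance() look like a local call.  The
        # transport is hypothetical; the point is what leaks through.

        def __init__(self, transport, retries=3, timeout=2.0):
            self.transport = transport    # hypothetical wire transport
            self.retries = retries
            self.timeout = timeout

        def get_balance(self, account_id):
            # Retrying blindly is only safe because this request is
            # idempotent; a transfer() method could not be handled this way.
            for attempt in range(self.retries):
                try:
                    return self.transport.call("get_balance", account_id,
                                               timeout=self.timeout)
                except (TimeoutError, ConnectionError):
                    time.sleep(2 ** attempt)   # back off and try again
            # A local call never makes its caller deal with this.
            raise ConnectionError("service unreachable")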

> > - how to schema evolve a distributed OO system assuming
> > orthogonal persistence and location transparency.
>
> This is one of the things that convinces me of the superiority
> of structural type systems for distributed computing. My expectation
> is that languages are going to face evolutionary pressure in
> the direction of features that are distributed- and multi-core-
> friendly.
>
> Schema evolution for nominally typed languages has proven
> to be quite brittle. I am not convinced it is *necessarily* so,
> but it begins to look like it.

I believe I understand the distinction between nominal and structural typing, but I don't see why structural would be better for distributed computing. Can you elaborate?

True orthogonal persistence implies that everything persists - even threads. In its purest form the idea is to be able to turn off a computer, later turn it on again, and have all threads and processes continue running as if it had never been switched off. This eliminates the need for transactions, and indeed for the programmer to care at all about the distinction between persistent and transient objects. It depends on finding a so-called "consistent cut": a snapshot of the system in which no received message is missing its corresponding send. This is bad enough on a single machine, never mind in a distributed system. BTW I know you are interested in lattice theory, so you might be interested to know that the consistent cuts form a lattice.
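
For the curious, a rough Python sketch of why (the message-log representation is my own, just for illustration): model a cut as a vector counting how many events of each process it includes; the meet and join of two consistent cuts are the componentwise min and max.

    # A cut is a tuple giving, for each process, how many of its events
    # are included.  'msgs' is a list of (sender, send_index, receiver,
    # recv_index) records - a representation invented for illustration.

    def consistent(cut, msgs):
        # Consistent iff every message received inside the cut was also
        # sent inside the cut (no effect without its cause).
        return all(cut[s] >= si for (s, si, r, ri) in msgs if cut[r] >= ri)

    def meet(c1, c2):
        return tuple(map(min, c1, c2))    # greatest lower bound

    def join(c1, c2):
        return tuple(map(max, c1, c2))    # least upper bound

    # meet() and join() of two consistent cuts are again consistent,
    # which is exactly the claim that the consistent cuts form a lattice.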

Remarkably, some research projects have tried to achieve this (e.g. Grasshopper). However, the problem of schema evolution is a show-stopper. How do you evolve a state machine while it is running (or has been snapshotted using a consistent cut)?

> > SOA suggests that a large system should be decomposed by behaviour (ie
> > "services") which is basically an OO way of thinking. It is a flawed
> > approach to the extent that it is promoted as the main way to build
> > enterprise systems. The only proven scalable approach is to remain
> > data-centric at ever increasing scales.
>
> Mmmm, I mostly agree but maybe you said it a bit too strong.
> Datacenter-services as an approach works okay, and even
> scales, provided you don't need much flexibility and can keep
> clients and servers closely coupled. Okay come to think of it
> that's a pretty bad situation to be in. I changed my mind: you
> have a good point.

>
> > The WWW is data-centric. It is not at all surprising that Http on
> > port 80 is *much* more common than RPC, CORBA, DCOM, RMI and SOAP put
> > together. Http concerns messages used to access data instead of
> > messages used to elicit behaviour.
>
> Interesting point.
>
> My high-level viewpoint: two important success factors for
> distributed computing turn out to be<drumroll>: logical independence
> and physical independence, at the (network) protocol level!
>
> Surprise!
>
> (I apologize for the buzzword-density of this post.)
Received on Sat Mar 15 2008 - 05:28:16 CET
