Re: Newbie question about db normalization theory: redundant keys OK?

From: David BL <davidbl_at_iinet.net.au>
Date: Sun, 30 Dec 2007 21:09:36 -0800 (PST)
Message-ID: <9e1b4a3f-c3f1-4b14-af76-dbf5396cbfbb_at_i29g2000prf.googlegroups.com>


On Dec 30, 4:23 am, Sampo Syreeni <de..._at_iki.fi> wrote:
> On 2007-12-26, David BL wrote:
>
> > But can you provide a realistic example, and offer a "proof" that
> > message queues are inadequate?
>
> I've given an example where the transaction both can be bounced by the
> target system because it violates a local integrity constraint, and also
> cannot be rolled back because acceptance of the effects of the
> transaction on the source system carries irreversible external effects.
> In such a situation you have no alternative but to obtain agreement from
> the target system *before* you return with success, which implies the
> use of a distributed agreement protocol. To me that's proof enough.

Are you talking about the example of the claim being allocated to a handler? See below.

> > When you say "Y is cancelled" do you mean an account on machine Y is
> > being closed?
>
> Yes.
>
> > 2. Given that peers only find out asynchronously that the account is
> > being closed there will be some period of time for which deposits
> > continue to be made into the account. Y must continue pumping messages
> > from its peers. [...]
>
> Then the "close account" transaction cannot return with success until
> you have a guarantee that no message queue in the system contains a
> deposit message aimed at the account.

There is no such thing as a distributed close-account transaction. This is an advantage of the asynchronous approach, because it promotes autonomy. In step 1 a local transaction on Y marked the account as closed and posted "close account" messages to local queues. So closing the account is fast and infallible, at least in the sense of being independent of network failures to the peers. This, for example, makes it easier to build clean and simple user interfaces. A user who wants to close an account on Y doesn't have to be concerned with whether Y can communicate with some computer X at that point in time.
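To make the idea concrete, here is a minimal sketch of step 1, using SQLite to stand in for Y's local database. The table and column names are my own illustration, not from the original discussion; the point is only that marking the account closed and posting the per-peer messages happen in one atomic local transaction, with no network involved.

```python
# Sketch of step 1: one local transaction marks the account closed
# and posts "close account" messages to the local per-peer queues.
# Table/column names are hypothetical.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE account (id TEXT PRIMARY KEY, closed INTEGER DEFAULT 0);
    CREATE TABLE outbox (peer TEXT, body TEXT);
    INSERT INTO account VALUES ('acct1', 0);
""")

peers = ["X", "Z"]
with db:  # one atomic local transaction; commits or rolls back as a unit
    db.execute("UPDATE account SET closed = 1 WHERE id = ?", ("acct1",))
    for p in peers:
        db.execute("INSERT INTO outbox VALUES (?, ?)", (p, "close acct1"))

# The transaction commits regardless of whether X or Z is reachable;
# the queued messages are delivered asynchronously later.
print(db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0])  # 2
```

The commit succeeds even if every peer is unreachable, which is exactly the autonomy argument: delivery of the queued messages is a separate, later concern.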

> In order to have that guarantee,
> you either have to obtain distributed agreement using some explicit
> commit protocol, or to have an implicit acknowledgement from all of the
> peers, say by lower bounding the submission time of the transaction
> they're now processing above the time your deposit message would have
> arrived. Unfortunately the latter option forces much tighter coupling on
> the systems than the former, because you have to have communication with
> all of the peers, not just those which actually participate in a
> transaction with the account. (Commit protocols communicate the
> participation information lazily and selectively by contacting peers
> when the transaction arrives. The second option would seem to dictate
> eager broadcasting of the information.) You'll also have to deal with
> distributed clock synchronization, which is a nasty problem in its own
> right.

I can't see why anything needs to know when the account has been "truly" closed (i.e. in step 4) other than internal cleanup tasks (similar to how operating systems must finish closing sockets after application code appears to have closed them, in order to meet the requirements of the protocol).

If you use 2PC to perform transfers between accounts and to close accounts, you need to implement a distributed lock manager, and you need to detect distributed deadlocks. I realise there is literature on this, but the field is horribly complex and the proposed solutions have many shortcomings. I don't agree that the synchronous approach has less coupling once you account for all the overheads: the machines must be coordinated for lock management, deadlock detection and, of course, the 2PC itself. Furthermore, when there are network failures it is far more likely for the whole system to come to a grinding halt.

Systems using 2PC need complex user interfaces to allow administrators to manually fix transactions that got confused.

> > 3. It is assumed that each peer (in a local transaction) eventually
> > reads the message that the account is being closed. This stops it from
> > performing any further deposits into the account. [...]
>
> If that is the case, then the peers have to remember which accounts on
> the other machines have been closed. That is, at worst all of them have
> to maintain state for all of the accounts in the system. That is
> obviously not a good idea in a distributed system.

This is a good point. However, it is easy to overcome. Suppose computer X tries to deposit money into an account on Y that no longer exists. X will have posted a message on a local queue to be read by Y. When Y eventually reads the message it will know it cannot honour the deposit, so it simply reverses it by posting a compensating deposit back to X.

Looking at steps 1-4 I now notice that Y really can treat the account as closed in step 1 and delete it from the system. There is no difficulty with Y reversing all subsequent attempts to deposit into that nonexistent account.
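A sketch of the reversal idea, with illustrative names throughout: when Y pumps a deposit message aimed at an account it has already deleted, it compensates by posting a deposit back toward the sender rather than keeping any per-peer "closed accounts" state.

```python
# Hedged sketch: Y processes its incoming queue from X. A deposit
# into a deleted account is reversed by queueing a compensating
# deposit back to X. All names/amounts are hypothetical.

accounts_on_y = {"acct1": 100}               # Y's local accounts; acct2 was deleted
queue_x_to_y = [("deposit", "acct2", 50)]    # X posted this before learning acct2 is gone
queue_y_to_x = []                            # Y's outgoing queue back to X

for kind, acct, amount in queue_x_to_y:
    if kind == "deposit":
        if acct in accounts_on_y:
            accounts_on_y[acct] += amount    # normal case
        else:
            # Account no longer exists: reverse by posting a
            # compensating deposit back to the sender's account on X.
            queue_y_to_x.append(("deposit", "sender-acct-on-x", amount))

print(queue_y_to_x)  # [('deposit', 'sender-acct-on-x', 50)]
```

Note that Y needs no memory of which of its accounts ever existed beyond the live ones; any unrecognised account simply triggers a reversal.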

> > There should be a concept of a claim that is not currently allocated
> > to a handler, and the claim is passed around the network until it is
> > assigned to a handler. Allocation of a claim to a handler is a local
> > decision using a local transaction allowing the legal obligation to be
> > enforced.
>
> If you allocate the claim to a handler, it might be that there is a
> reassignment message in one of the remote queues already destined for
> that handler.

Not possible! Let me rephrase in more detail. Firstly I'll describe a useful design pattern:

  Problem:
    Achieve global consensus on the one and only location
    of a token in a distributed system. More precisely,
    ensure that exactly one computer knows it has the
    token at a given time.

  Assumptions:
    Arbitrary network failures and changes to network
    topology. Computers support local transactions that
    feature atomicity and optional durability.

  Solution:
    Each computer persists a flag for whether the token
    is currently allocated to that computer.

    Each computer persists a local queue for each remote
    computer to which it will post messages.

    Each computer persists a sequence number for each
    remote queue that it pumps.

    Initially the token is allocated to exactly one
    computer.

    The token is moved to another computer as follows:

  1. Only the computer that currently has the token is
     allowed to commit a local transaction that atomically
       a) clears its local flag; and
       b) pushes a message onto a local queue that
          represents transfer of the token to another
          machine.

  2. When a computer reads a remote message denoting
     receipt of the token, it commits a local transaction
     that atomically
       a) sets its local flag; and
       b) increments the sequence number to indicate the
          message has been processed.

Note that the token is pushed around, not pulled. Nevertheless, orthogonally to this pattern there can be "pull" messages representing requests that the token be pushed in a particular direction.
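The pattern above can be sketched in a few lines. This is an in-memory toy (class and method names are my own); a real system would persist the flag, queues and sequence numbers in a local database so that each numbered step really is one atomic local transaction.

```python
# Minimal in-memory sketch of the token-passing pattern.
# Node, transfer_token and pump are hypothetical names.

class Node:
    def __init__(self, name, has_token=False):
        self.name = name
        self.has_token = has_token   # the persisted flag
        self.out_queues = {}         # peer name -> list of posted messages
        self.seq = {}                # peer name -> next message index to read

    def transfer_token(self, peer):
        # Step 1: one atomic local transaction clears the flag and
        # enqueues the transfer message. Only the holder may do this.
        assert self.has_token, "only the token holder may transfer it"
        self.has_token = False
        self.out_queues.setdefault(peer.name, []).append("TOKEN")

    def pump(self, peer):
        # Step 2: read the peer's queue for us; on a TOKEN message, one
        # atomic local transaction sets the flag and bumps the sequence
        # number so the message can never be processed twice.
        queue = peer.out_queues.get(self.name, [])
        n = self.seq.get(peer.name, 0)
        for msg in queue[n:]:
            if msg == "TOKEN":
                self.has_token = True
            n += 1
            self.seq[peer.name] = n

x, y = Node("X", has_token=True), Node("Y")
x.transfer_token(y)   # X atomically gives up the token and posts a message
y.pump(x)             # Y atomically receives the token
print(x.has_token, y.has_token)  # False True
```

Between the commit on X and the pump on Y, no computer's flag is set, which is consistent with the guarantee claimed: at most one computer ever *knows* it has the token.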

Now getting back to your problem of allocating claims to handlers...

We use the above design pattern to allocate the claim to a machine without implying that it has yet been allocated to a handler. A machine has no constraint on the number of claims it has been allocated, and it is always able to push claims out to its peers as it sees fit.

According to the pattern, a claim will be allocated to exactly one machine at a given time. Only that machine can choose to allocate the claim to a local handler using a local transaction that meets the legal obligation.
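Because allocation is then a purely local decision on the machine holding the claim, a constraint such as "at most ten claims per handler" can be enforced in one local transaction, with no distributed agreement. A hedged sketch, with all names and the limit of ten taken as illustrative assumptions:

```python
# Illustrative sketch: the machine that holds a claim checks the
# handler-capacity constraint and allocates (or refuses) atomically.
# Names and data are hypothetical.

MAX_CLAIMS_PER_HANDLER = 10
handler_claims = {"alice": 9, "bob": 10}   # current per-handler counts
local_unallocated_claims = ["claim-42"]    # claims this machine holds

def allocate(claim, handler):
    # One local transaction: check the constraint and allocate, or refuse.
    if handler_claims.get(handler, 0) >= MAX_CLAIMS_PER_HANDLER:
        return False  # keep the claim unallocated; push it to a peer later
    handler_claims[handler] = handler_claims.get(handler, 0) + 1
    local_unallocated_claims.remove(claim)
    return True

print(allocate("claim-42", "bob"))    # False: bob is at capacity
print(allocate("claim-42", "alice"))  # True
```

A refused allocation is not an error: the machine still holds the claim and can push it to a peer, per the token-passing pattern.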

> If that second claim then pushes the number of claims for
> this handler past ten, one of the allocations was an irreversible
> mistake. Hence, you cannot allocate and/or reallocate a claim before you
> have a guarantee that this won't happen in the future. You'll again need
> distributed agreement. Of course you *could* wait until you can be sure
> that the allocation can no longer bounce, but then you'll again have the
> issues with clock synchronization, excessive coupling and so on, and in
> an online system you can't really have guarantees about the future
> unless you actually quiesce everything but this one transaction. You
> can't seriously be thinking of anything like that, can you?

Is my solution valid?

> > A 100% guarantee of global consistency is provably not possible and
> > multiphase commit protocols can get it wrong under certain (perhaps
> > unlikely) failure conditions.
>
> Of course. You can't get something for nothing, so you usually have to
> make some assumptions about how the underlying system behaves. But then,
> under the usual assumptions like nonpartition of the network etc., you
> *can* guarantee quite a lot using just 2PC.

The following MSDN article seems typical of 2PC nastiness.

    http://msdn2.microsoft.com/en-us/library/ms681185(VS.85).aspx

Too many demands are placed on the poor old DBA.

> > It is far from clear to me that message queues cannot provide similar
> > statistical guarantees.
>
> Perhaps, but thus far you haven't been able to show how they would
> accomplish even the kind of guarantees the usual commit protocols do in
> my example case, with lower latency, more permissive assumptions about
> the lower level system, or whathaveyou.

What example case are you referring to?

It seems to me you evaluate in terms of synchronous requirements, e.g. your remarks above on closing a bank account. When applied consistently, the asynchronous approach is far less likely to be judged by network latency, and instead by total throughput.

Done properly message queues don't make *any* assumptions about the nature of network failures, in contrast to 2PC/3PC.

> > How many programmers properly understand how 3PC avoids most but not
> > all blocking that can occur in 2PC?
>
> Application programmers are not *supposed* to understand that sort of
> thing. The very reason there is such a thing as a transaction
> abstraction, with its usual ACID properties, is that you can hide the
> complex implementation details behind it, and let the presumably more
> knowledgeable database implementers take care of them, once and for all.

I don't believe it's reasonable to hide the distributed nature. There is too big a hit on performance, correctness and robustness.

> > The recovery mechanisms for these protocols are horribly complicated.
>
> Well, just considering cascading aborts, message queues can behave
> terribly badly in that sense as well.

In this case I was only talking about complexity of the implementation.

> > There are a number of papers about the impossibility of properly
> > defining global properties of a distributed system.
>
> Yes, but that sort of trouble can be cut through by making assumptions.
> At least in that case you can get conditional certainty, and solid
> grounds for probabilistic reasoning based on those conditions.

I'm interested in the mathematics behind distributed computing, and I regard the (simple) proof that distributed transactions cannot be immune to arbitrary network failures as deeply troubling. To my mind distributed transactions don't represent a mathematically correct "atom" for decomposing a system.

> > Paradoxically, the asynchronous approach is so much more efficient it
> > will reach quiescence far more quickly than any 2PC approach could
> > hope for.
>
> I don't think quiescence is the aim in an online environment. And as I
> said, when you can do with less, you should have the option; there is
> nothing wrong with your approach where it *does* apply. It's just that
> it's not general enough.

That may be the case, but I want to see a compelling example before I'll agree it is not general enough.

> >> Even this is not really true: the new state of the queue still needs to
> >> be forced into stable storage before the transaction can safely return.
> >> Otherwise a crash before the message insertion is made persistent would
> >> cause an inconsistency between the states the client and the DBMS see.
>
> > No the queue is part of the local DB state. Local transactions ensure
> > atomicity across all local DB state irrespective of durability.
>
> Precisely my point. The local transaction needs to cover the durable
> update of the message queue, so it also needs to force the update into
> stable storage before it can return. As a matter of fact, it probably
> shouldn't return before the system is certain that the transfer can no
> longer bounce from the target system.

I don't agree. Are you confusing atomicity and durability? They are orthogonal concepts. The only requirement is atomicity.

> > The need to force the log relates to a requirement of durability on
> > local messages in a queue that are asynchronously posted to a peer.
>
> But it also relates to durability as the user initiating the transaction
> sees it. Either your terminal says the transaction was aborted or it
> says it committed. The client/terminal/user is part of the transaction,
> and returning with success means the effects must have been made
> durable. Otherwise the user/client is in an inconsistent state with
> respect to the rest of the system.

Yes, that is another reason why a local transaction may need to be durable.

My point was that 2PC requires flushing the log as part of the protocol itself, and the corresponding durability requirement with message queues is far less onerous.

> > This provides a lot of freedom for forcing the log much less often
> > than the ingestion rate.
>
> Sure. But that precise latitude is already exploited widely even in
> DBMSs that do not utilize message queues. After all, there is a
> considerable literature on logging policy alone.

No, with 2PC there is no latitude: cohorts must flush the log when they acknowledge phase 1.

Received on Mon Dec 31 2007 - 06:09:36 CET
