Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Surrogate Key vs Production Key

Re: Surrogate Key vs Production Key

From: Noons <wizofoz2k_at_yahoo.com.au>
Date: 21 Oct 2004 21:49:53 -0700
Message-ID: <73e20c6c.0410212049.3485ef7e@posting.google.com>


ed.prochak_at_magicinterface.com (Ed prochak) wrote in message news:<4b5394b2.0410210917.7d6340f1_at_posting.google.com>...

>
> SO you have never had a programming bug that touched your surrogate
> values?

Yes I have, and I have recovered using the recovery mechanisms of the database. Period.

> Lucky you.
> (Better keep your favorite good luck charm always close by!)

You know, that "luck" argument is wearing a bit thin. Luck has nothing to do with what I do professionaly. That's why I've been in this industry for 29 years. No amount of luck would have protected me for that long if it was the main factor.

> Look RI between the PK and FK has nothing to do with it.

WHAT? Are you sure you meant to say that monstrosity?

> My point is
> there is NO RELATIONSHIP between the surrogate and the data it
> represents.

Yes, PRECISELY. Why do you insist on assigning it one? Who said there had to be one?

> The assignment of the surrogate is completely arbitrary
> and therefore impossible to validate.

WTH for would I want to validate it? You either use a database, which does that validation for you, or you don't. If the latter, why the heck are you replyin to this thread? It's about databases.

> table B
> idB idA ... other columns
> 1 2 123 baker street
> 2 1 987 signa Drive
>
> Now, what FK to PK RI constraint did I violate?

None whatsoever. What, you trying to tell me that if you were using natural keys in your example, one would have been violated? Hello?????

> If this happened to two rows out of 100,000 addresses, how would you
> find it?

The same way I'd find it if I was using a natural key: checking the data history with a known source. Or running a database recovery if it had been caused by sofware. If you enter the wrong father/child data with natural keys, you have EXACTLY the same problem. What are you trying to prove with this?

> Alternate optional question: how do you design your model to enable
> recovery from this? (yes there are ways, but they reduce some of the
> advantages you speak of for SKs)

There you go again! Subverting the recovery mechanisms of the database without the slightest need. There is NOTHING that mandates you have to build into your design the recovery of your data, IF you are using a database.

If you are using a "SQL-engine" like so many idiots do nowadays, THEN *YES* you have to build in the recovery: no one will do it for you otherwise!

I'm assuming ALL ALONG that we are talking about relational DATABASES. So did weveryone else in this thread. Not crap "engines". If you want to include those now, then you are most welcome. I just wish you had been clear about that since the start.

> It has NOTHING to do with RI as I just showed you. bad SK values can
> get into the system even with strong RI in place.

So can bad natural keys! What stops you from using the wrong father or child name? There is NOTHING in natural keys that ensure their correctness on data entry. Once again, you are confusing data validation with referential integrity/relationship integrity.

> > of surrogate versus natural keys? Since WHEN does the
> > use of surrogate keys preclude the use of RI? Helloooooo?
>
> This is your arguement about RI, it is not mine.

No it most definitely isn't. I NEVER claimed that SKs do replace RI. NOWHERE.
Like I said: instead of jumping the gun, READ and THINK about what I wrote.

> I picked that example precisely because I've experienced such problems
> on BOTH sides (the names corrupted and the surrogate key moved). Many
> times to recovery required manual review, but a human can look at john
> smithe in the child table and guess that maybe it is really belongs
> to john smith.

Oh yeah? So, when you send your john smithe row to john smith and it should have been John Smith, whose fault is it then? What happens when John Smith sues you for sending him a wrong traffic fine?
Bad guess, eh?

> and in the other example it may be clear that the child
> record with id 1001 does not belong to heidi buchenwald, but I have no
> clue how to restore it back to john smith's id.

You have no idea with the natural key either. You are just assuming that your a-priori clue will be correct. With no proof whatsoever of such other than it being part of your a-priori assumed knowledge.

I assume NOTHING about my data, unless the user has done so for me. It is up to the user to establish the validity of data. Not to me to assume what it should have been.

> Again, you are lucky having NEVER had a programming bug, never having
> to convert your data into another system.

Done nothing else for the last 29 years. As for conversions, don't get me started...

> But now you changed your mind. Before the argument for SKs was saving
> data space. you don't want to waffle in your views this late in the
> campaign, er discussion. 8^)

What? Show me where did I say it was saving data space? I said from the very start that I didn't mind the small overhead of a SK for the gains I get. However, I STRONGLY object to the enormous overhead of a natural key, for zero gain in data integrity other than subjective warm feelings about being able to recognize a "wrong" key.

> Since when?

Since forever.

> I've seen more than one database engine that internally is a network
> model, but supports a SQL query engine. The only difference is the
> wider range of performance on queries.
> (queries that match the model are fast, those that don't can be VERY
> SLOoow)

So? A relational database engine is NOT the same as an ad-hoc query product.

> and this is modeled by POINTERS so no data is repeated fom the parent

pointers are NOT surrogate keys and vicky the versa. Case closed as far as analogies with older database architectures.

> to the child table, there's only a link between them. But that doesn't
> stop you from asking about relationships between to entities in the
> model that are not directly connected. You may have to follow some
> long chains to make the connection that's all.

You missed the point. A relational database lets you join two tables on columns that are NOT "linked" by ANY assumed relationship. You can even join on a logical comparison between two expressions. THAT is a relationship on the fly. Something you will never be able to do in a network or hierarchical model, no matter how many paths you traverse. Other than by brute force programmatic data traversal.

> But with SK's you have the same problem. Simple example: given your SK
> database, how do you find all the BOOKS owned by john smith. You have
> to follow the link, selecting from BOTH the PERSONS table and the
> BOOKS table joining on the SK,

Absolutely. THAT is what a relational database is for. Why do you even presume that joins are not desirable?

> while a natural key model would just
> look in the BOOKS table, because the owner would be there.

And that is called denormalisation, with SKs. Once again, you are bringing in subjects that have NOTHING to do with natural vs surrogate keys and RI. If you NEED to denormalise using SKs, you do PRECISELY that. NOTHING to stop you from doing so. Another moot point.

> Please don't speak for me. Once you pick surrogate keys, there's
> nothing dynamic about your logical data model.

A data model better not be dynamic or else someone didn't do their job. I said: "relational is so flexible that it may indeed allow emulation of what a network or hierarchical database does. And dynamically". That has nothing to do with data models, or surrogate keys for that matter. Once again, an unwarranted jump sideways.

> model. I'm trying to teach you that you are working in a network
> LOGICAL model.

Are you insane?

> Try not to confuse yourself.

Confused? The one here changing the universe of discussion is you. Constantly.

> Just because you use a
> product touted as a Relational DBMS that your specific data model is
> relational.

I won't even try to reason that one....

> Find surrogate keys in Codd's rules. I really don't think you will.

Funny. Joe didn't have any problem in bringing Codd's rules for surrogate keys, in his reply. Did you actually READ what he wrote?

> Maybe you should look again at what I said.
> I've seen these problems
> happen.

I did.

So have I. With incomplete half-baked designs purporting to be "relational", where all RI is handled by the application instead of the database. What keys were used is completely irrelevant.

> I've had to go in and clean up the mess.

Story of my life. With design and code.

> Call me when it hapens to you.

Doubt it. My designs have builtin safeguards against this sort of problem. Not one so far has ever needed intervention in this area. And that's a LOT of years.

> I could use some new clients.

I can see that.
(sorry, cheap shot)

> Meanwhile, I'm tired of
> presenting examples that you apparently don't read.

You presented ONE example that was completely irrelevant and started from a premise about surrogate keys that I never claimed to be correct. What can I say? You don't even spend the time to try and understand what is being said, in your haste to sound patronizingly correct.

> I think I'll just watch Joe and others try to explain this to you.
> (not that they are having much more success.)

I'll be convinced when the industry as a whole even begins to switch to the crazy ideas exposed by you lot.

And when someone gives me an example that proves a natural key can implicitly validate itself "because it's a natural key". Instead of silly examples a child would need 30 seconds to put a hole through.

Until then, I'll do what everyone does and be safe. Rather than sorry or needing to fix data, corrupted by incomplete design or deranged implementations. Received on Thu Oct 21 2004 - 23:49:53 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US