Re: Multiple-Attribute Keys and 1NF

From: JOG <jog_at_cs.nott.ac.uk>
Date: Tue, 28 Aug 2007 16:24:49 -0700
Message-ID: <1188343489.083058.22100_at_50g2000hsm.googlegroups.com>


On Aug 29, 12:12 am, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
> JOG wrote:
> > On Aug 28, 8:23 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>
> >>JOG wrote:
>
> >>>On Aug 28, 7:41 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>
> >>>>JOG wrote:
>
> >>>>>On Aug 28, 5:43 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>
> >>>>>>JOG wrote:
>
> >>>>>>>On Aug 28, 5:05 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>
> >>>>>>>>JOG wrote:
>
> >>>>>>>>>I am still fighting with the theoretical underpinning for 1NF. As
> >>>>>>>>>such, any comments would be greatfully accepted. The reason for my
> >>>>>>>>>concern is that there /seems/ instances where 1NF is insufficient. An
> >>>>>>>>>example occurred to me while I was wiring up a dimmer switch (at the
> >>>>>>>>>behest of mrs. JOG, to whom there may only be obeyance). Now I don't
> >>>>>>>>>know the situation in the US, but in the UK a while back the colour
> >>>>>>>>>codes for domestic main circuit wiring changed. Naturally the two
> >>>>>>>>>schemes exist in tandem, as exhibited in every house I've had the joy
> >>>>>>>>>of doing some DIY in:
>
> >>>>>>>>>Brown -> live.
> >>>>>>>>>Red -> live
> >>>>>>>>>Blue -> neutral.
> >>>>>>>>>Black -> neutral.
> >>>>>>>>>Green and yellow -> earth.
>
> >>>>>>>>>The issue with encoding these propositions is that the candidate key
> >>>>>>>>>for each proposition may consist of one _or_ two colours. Now I have a
> >>>>>>>>>couple of options, none of which seem satisfactory. I could leave
> >>>>>>>>>green & yellow as some sort of set-value composite, but obviously this
> >>>>>>>>>would affect my querying capabilities, so thats out straight off the
> >>>>>>>>>bat. Similarly adding attributes Colour1 and a nullable Colour2 is
> >>>>>>>>>simply so hideous it isn't worth consideration. So, I could ungroup to
> >>>>>>>>>give me:
>
> >>>>>>>>>Colour Type
> >>>>>>>>>-----------------
> >>>>>>>>>Brown live
> >>>>>>>>>Red live
> >>>>>>>>>Black neutral
> >>>>>>>>>Blue neutral
> >>>>>>>>>Green earth
> >>>>>>>>>Yellow earth
> >>>>>>>>>-----------------
>
> >>>>>>>>>But again this is unsatisfactory as I have lost the information that
> >>>>>>>>>one wire is green and yellow, but none is brown /and/ red.
>
> >>>>>>>>>I could introduce a surrogate to give me:
>
> >>>>>>>>>Id Colour Type
> >>>>>>>>>-----------------
> >>>>>>>>>1 Brown live
> >>>>>>>>>2 Red live
> >>>>>>>>>3 Black neutral
> >>>>>>>>>4 Blue neutral
> >>>>>>>>>5 Green earth
> >>>>>>>>>5 Yellow earth
> >>>>>>>>>-----------------
>
> >>>>>>>>>But this seems wholly artificial given that all the information I
> >>>>>>>>>required for identification was available in the original
> >>>>>>>>>propositions, and that did not require some artificial id. A [shudder]
> >>>>>>>>>non 1NF variation such as:
>
> >>>>>>>>>Id Colour Type
> >>>>>>>>>-----------------
> >>>>>>>>>1 Brown live
> >>>>>>>>>2 Red live
> >>>>>>>>>3 Black neutral
> >>>>>>>>>4 Blue neutral
> >>>>>>>>>5 Green, earth
> >>>>>>>>>Yellow
> >>>>>>>>>-----------------
>
> >>>>>>>>>is clearly hideous as it denies the fundamental mathematical principle
> >>>>>>>>>that that one attribute should take one value from one domain,
> >>>>>>>>>nevermind the fact that it introduces query bias.
>
> >>>>>>>>>I could of course introduce nested relations, but I am uncertain as to
> >>>>>>>>>the theoretical consequences of having nested relation as a key (I
> >>>>>>>>>guess it would be fine, if adding seemingly unnessecary complexity to
> >>>>>>>>>subsequent queries). But moreover it again seems unintuitive, given
> >>>>>>>>>that in this case it would indicating that the original propositions
> >>>>>>>>>contained, as a value for one of their attributes, a further
> >>>>>>>>>proposition, and this was not the case.
>
> >>>>>>>>>I am having a crisis of faith with the way 1NF is currently viewed.
> >>>>>>>>>Any ideas to solve my dilemma? Am I on my own in being perturbed?
>
> >>>>>>>>>Regards, Jim.
>
> >>>>>>>>There is one obvious way to represent that in 1NF:
>
> >>>>>>>>Create a color domain where a single value represents green and yellow,
> >>>>>>>>another value represents green, and a third represents yellow etc. The
> >>>>>>>>domain could even represent thick green/thin yellow as a separate value
>
> >>>>>>>>from thick yellow/thin green if one chooses.
>
> >>>>>>>>Regardless whether one creates only the domain or also uses it as a
> >>>>>>>>candidate key for some sort of lookup table, the resulting relation is
> >>>>>>>>simply:
>
> >>>>>>>>Colour Type
> >>>>>>>>======= -------
> >>>>>>>>...
>
> >>>>>>>>Your ID above is one example of such a domain. However, the domain need
> >>>>>>>>not be numeric or have any external numeric representations. It need
> >>>>>>>>only exist with a distinct value for green and yellow.
>
> >>>>>>>Well, practically, the surrogate key is the way that I would go. My
> >>>>>>>question is rather whether this corresponds naturally to the original
> >>>>>>>propositions, which don't require a new domain in order to be
> >>>>>>>manipulated in FOL.
>
> >>>>>>You assume a color domain so imagining a different color domain changes
> >>>>>>the design without adding anything new.
>
> >>>>>Okay, you're right - not a new domain, just a different one. If I had
> >>>>>started with domain of all colours C (clearly containing the colour
> >>>>>"grey" given the presence of the u there), I read you as proposing
> >>>>>that it be replaced with a labelled powerset of C. Howwwever, would
> >>>>>occams razor not suggest that we should prefer a domain made up of
> >>>>>atomic individuals, as opposed to aliased sets, which will require an
> >>>>>extra step to decompose?
>
> >>>>I don't recall suggesting anything about sets--just a domain that has a
> >>>>distinct value that means "green and yellow".
>
> >>>Okay, sure. But then to be able to query for green and yellow
> >>>individually one must employ a further relation encoding two more
> >>>propositions that state "'Green and yellow' contains 'Green'" and that
> >>>"'Green and yellow' contains 'Yellow'" respectively. One then has a
> >>>schema with two domains - one for the composites and one for
> >>>individual colours (which is what I was inferring when I initially
> >>>said a new one was being added).
>
> >>Assuming one has a need to query for green separately, I suppose one can
> >>define an operator on the domain to that effect. If one invents a
> >>requirement that requires a second domain, then one will need a second
> >>domain regardless.
>
> > Well that sort of brings us full circle back to to my query as to
> > whether a structure that doesn't require that second domain,
>
> Let me be clear: Unless you invent a requirement that requires a second
> domain, no second domain is required. If one invents such a requirement,
> one is.
>
> such as a
>
> > set where elements themselves are pure mathematical relations
> > containing attribute/value pairs:
>
> > Wires = { {(Color, Yellow), (Color, Green), (Type, earth)}} {(Color,
> > blue), (Type, live)} }
>
> But a set is a second domain. You have 1) colors and 2) sets of colors.
> Actually, you have sets of some supertype of color and type?!? Yuck!

I'm worrying that you have misinterpreted what I have sketched out there. I haven't specified any domain sets at all in the above - it is just a set of propositions, and as with RM, each element is a mapping from attribute names onto values, that's all. I have no idea why you think I have supertypes, etc, in there. (which I agree would be yuck)

>
> > has any negative theoretical impacts. I can see immediately that this
> > would affect WHERE and ON clauses in the algebra, and one would get
> > more use out of an GROUP/UNGROUP statements, but I see nothing
> > inherently /bad/.
>
> > Yet that is.
>
> Supposing you have a requirement that you must be able to use the green
> and the yellow separately and supposing you choose to use a set of
> values (i.e. an RVA), you have already identified the problem that when
> one ungroups, one loses the information that green and yellow belong as
> a pair.
>
> To preserve this information, the dbms would have to have some facility
> to generate an artificial identifier for the pair. However, if one
> normalized the base relations, one would already have the identifier,
> and it would be rather simple to construct the RVA in a derived relation.

Much food for thought. Thanks for the responses Bob. Received on Wed Aug 29 2007 - 01:24:49 CEST

Original text of this message