Re: Multiple-Attribute Keys and 1NF

From: JOG <jog_at_cs.nott.ac.uk>
Date: Tue, 28 Aug 2007 17:05:54 -0700
Message-ID: <1188345954.269322.208250_at_d55g2000hsg.googlegroups.com>


On Aug 29, 12:42 am, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
> JOG wrote:
> > On Aug 29, 12:12 am, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>
> >>JOG wrote:
>
> >>>On Aug 28, 8:23 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>
> >>>>JOG wrote:
>
> >>>>>On Aug 28, 7:41 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>
> >>>>>>JOG wrote:
>
> >>>>>>>On Aug 28, 5:43 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>
> >>>>>>>>JOG wrote:
>
> >>>>>>>>>On Aug 28, 5:05 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>
> >>>>>>>>>>JOG wrote:
>
> >>>>>>>>>>>I am still fighting with the theoretical underpinning for 1NF. As
> >>>>>>>>>>>such, any comments would be greatfully accepted. The reason for my
> >>>>>>>>>>>concern is that there /seems/ instances where 1NF is insufficient. An
> >>>>>>>>>>>example occurred to me while I was wiring up a dimmer switch (at the
> >>>>>>>>>>>behest of mrs. JOG, to whom there may only be obeyance). Now I don't
> >>>>>>>>>>>know the situation in the US, but in the UK a while back the colour
> >>>>>>>>>>>codes for domestic main circuit wiring changed. Naturally the two
> >>>>>>>>>>>schemes exist in tandem, as exhibited in every house I've had the joy
> >>>>>>>>>>>of doing some DIY in:
>
> >>>>>>>>>>>Brown -> live.
> >>>>>>>>>>>Red -> live
> >>>>>>>>>>>Blue -> neutral.
> >>>>>>>>>>>Black -> neutral.
> >>>>>>>>>>>Green and yellow -> earth.
>
> >>>>>>>>>>>The issue with encoding these propositions is that the candidate key
> >>>>>>>>>>>for each proposition may consist of one _or_ two colours. Now I have a
> >>>>>>>>>>>couple of options, none of which seem satisfactory. I could leave
> >>>>>>>>>>>green & yellow as some sort of set-value composite, but obviously this
> >>>>>>>>>>>would affect my querying capabilities, so thats out straight off the
> >>>>>>>>>>>bat. Similarly adding attributes Colour1 and a nullable Colour2 is
> >>>>>>>>>>>simply so hideous it isn't worth consideration. So, I could ungroup to
> >>>>>>>>>>>give me:
>
> >>>>>>>>>>>Colour Type
> >>>>>>>>>>>-----------------
> >>>>>>>>>>>Brown live
> >>>>>>>>>>>Red live
> >>>>>>>>>>>Black neutral
> >>>>>>>>>>>Blue neutral
> >>>>>>>>>>>Green earth
> >>>>>>>>>>>Yellow earth
> >>>>>>>>>>>-----------------
>
> >>>>>>>>>>>But again this is unsatisfactory as I have lost the information that
> >>>>>>>>>>>one wire is green and yellow, but none is brown /and/ red.
>
> >>>>>>>>>>>I could introduce a surrogate to give me:
>
> >>>>>>>>>>>Id Colour Type
> >>>>>>>>>>>-----------------
> >>>>>>>>>>>1 Brown live
> >>>>>>>>>>>2 Red live
> >>>>>>>>>>>3 Black neutral
> >>>>>>>>>>>4 Blue neutral
> >>>>>>>>>>>5 Green earth
> >>>>>>>>>>>5 Yellow earth
> >>>>>>>>>>>-----------------
>
> >>>>>>>>>>>But this seems wholly artificial given that all the information I
> >>>>>>>>>>>required for identification was available in the original
> >>>>>>>>>>>propositions, and that did not require some artificial id. A [shudder]
> >>>>>>>>>>>non 1NF variation such as:
>
> >>>>>>>>>>>Id Colour Type
> >>>>>>>>>>>-----------------
> >>>>>>>>>>>1 Brown live
> >>>>>>>>>>>2 Red live
> >>>>>>>>>>>3 Black neutral
> >>>>>>>>>>>4 Blue neutral
> >>>>>>>>>>>5 Green, earth
> >>>>>>>>>>>Yellow
> >>>>>>>>>>>-----------------
>
> >>>>>>>>>>>is clearly hideous as it denies the fundamental mathematical principle
> >>>>>>>>>>>that that one attribute should take one value from one domain,
> >>>>>>>>>>>nevermind the fact that it introduces query bias.
>
> >>>>>>>>>>>I could of course introduce nested relations, but I am uncertain as to
> >>>>>>>>>>>the theoretical consequences of having nested relation as a key (I
> >>>>>>>>>>>guess it would be fine, if adding seemingly unnessecary complexity to
> >>>>>>>>>>>subsequent queries). But moreover it again seems unintuitive, given
> >>>>>>>>>>>that in this case it would indicating that the original propositions
> >>>>>>>>>>>contained, as a value for one of their attributes, a further
> >>>>>>>>>>>proposition, and this was not the case.
>
> >>>>>>>>>>>I am having a crisis of faith with the way 1NF is currently viewed.
> >>>>>>>>>>>Any ideas to solve my dilemma? Am I on my own in being perturbed?
>
> >>>>>>>>>>>Regards, Jim.
>
> >>>>>>>>>>There is one obvious way to represent that in 1NF:
>
> >>>>>>>>>>Create a color domain where a single value represents green and yellow,
> >>>>>>>>>>another value represents green, and a third represents yellow etc. The
> >>>>>>>>>>domain could even represent thick green/thin yellow as a separate value
>
> >>>>>>>>>>from thick yellow/thin green if one chooses.
>
> >>>>>>>>>>Regardless whether one creates only the domain or also uses it as a
> >>>>>>>>>>candidate key for some sort of lookup table, the resulting relation is
> >>>>>>>>>>simply:
>
> >>>>>>>>>>Colour Type
> >>>>>>>>>>======= -------
> >>>>>>>>>>...
>
> >>>>>>>>>>Your ID above is one example of such a domain. However, the domain need
> >>>>>>>>>>not be numeric or have any external numeric representations. It need
> >>>>>>>>>>only exist with a distinct value for green and yellow.
>
> >>>>>>>>>Well, practically, the surrogate key is the way that I would go. My
> >>>>>>>>>question is rather whether this corresponds naturally to the original
> >>>>>>>>>propositions, which don't require a new domain in order to be
> >>>>>>>>>manipulated in FOL.
>
> >>>>>>>>You assume a color domain so imagining a different color domain changes
> >>>>>>>>the design without adding anything new.
>
> >>>>>>>Okay, you're right - not a new domain, just a different one. If I had
> >>>>>>>started with domain of all colours C (clearly containing the colour
> >>>>>>>"grey" given the presence of the u there), I read you as proposing
> >>>>>>>that it be replaced with a labelled powerset of C. Howwwever, would
> >>>>>>>occams razor not suggest that we should prefer a domain made up of
> >>>>>>>atomic individuals, as opposed to aliased sets, which will require an
> >>>>>>>extra step to decompose?
>
> >>>>>>I don't recall suggesting anything about sets--just a domain that has a
> >>>>>>distinct value that means "green and yellow".
>
> >>>>>Okay, sure. But then to be able to query for green and yellow
> >>>>>individually one must employ a further relation encoding two more
> >>>>>propositions that state "'Green and yellow' contains 'Green'" and that
> >>>>>"'Green and yellow' contains 'Yellow'" respectively. One then has a
> >>>>>schema with two domains - one for the composites and one for
> >>>>>individual colours (which is what I was inferring when I initially
> >>>>>said a new one was being added).
>
> >>>>Assuming one has a need to query for green separately, I suppose one can
> >>>>define an operator on the domain to that effect. If one invents a
> >>>>requirement that requires a second domain, then one will need a second
> >>>>domain regardless.
>
> >>>Well that sort of brings us full circle back to to my query as to
> >>>whether a structure that doesn't require that second domain,
>
> >>Let me be clear: Unless you invent a requirement that requires a second
> >>domain, no second domain is required. If one invents such a requirement,
> >>one is.
>
> >>such as a
>
> >>>set where elements themselves are pure mathematical relations
> >>>containing attribute/value pairs:
>
> >>>Wires = { {(Color, Yellow), (Color, Green), (Type, earth)}} {(Color,
> >>>blue), (Type, live)} }
>
> >>But a set is a second domain. You have 1) colors and 2) sets of colors.
> >>Actually, you have sets of some supertype of color and type?!? Yuck!
>
> > I'm worrying that you have misinterpreted what I have sketched out
> > there. I haven't specified any domain sets at all in the above - it
> > is just a set of propositions, and as with RM, each element is a
> > mapping from attribute names onto values, that's all. I have no idea
> > why you think I have supertypes, etc, in there. (which I agree would
> > be yuck)
>
> Based on this set: {(Color, Yellow), (Color, Green), (Type, earth)}

Still not seeing where you get supertypes from. I just see a mapping of roles in a proposition to corresponding values.

> That is not a tuple. A tuple would be:
>
> {(Color, {Yellow, Green}), (Type, earth)}

Yes, I realize it is not a db-tuple, because if one relaxes 1NF then one doesn't have a db-relation at all. That set-valued element still represents a proposition however, and is in fact a relation in the true mathematical sense. I find this representation interesting because a JOIN becomes a union of these elements, and a natural join is generated by default as one would expect.

>
> The names must be unique within a tuple.
>
>
>
> >>>has any negative theoretical impacts. I can see immediately that this
> >>>would affect WHERE and ON clauses in the algebra, and one would get
> >>>more use out of an GROUP/UNGROUP statements, but I see nothing
> >>>inherently /bad/.
>
> >>>Yet that is.
>
> >>Supposing you have a requirement that you must be able to use the green
> >>and the yellow separately and supposing you choose to use a set of
> >>values (i.e. an RVA), you have already identified the problem that when
> >>one ungroups, one loses the information that green and yellow belong as
> >>a pair.
>
> >>To preserve this information, the dbms would have to have some facility
> >>to generate an artificial identifier for the pair. However, if one
> >>normalized the base relations, one would already have the identifier,
> >>and it would be rather simple to construct the RVA in a derived relation.
>
> > Much food for thought. Thanks for the responses Bob.
>
> You are very welcome. Some of this stuff seems obvious to me now, but at
> one time it was anything but.
Received on Wed Aug 29 2007 - 02:05:54 CEST

Original text of this message