Re: Resiliency To New Data Requirements

From: dawn <dawnwolthuis_at_gmail.com>
Date: 11 Aug 2006 20:59:55 -0700
Message-ID: <1155355195.433608.11110_at_i42g2000cwa.googlegroups.com>

JOG wrote:
> dawn wrote:
> > JOG wrote:
> > > dawn wrote:
> > > > Marshall wrote:
> > > > > dawn wrote:
> > > > > >
> > > > > > I agree. If we are going to start somewhere and move forward, we might
> > > > > > be well-served to look to what works today outside of the RM (even
> > > > > > though it, of course, typically markets itself as relational). Is it
> > > > > > less expensive to work with Cache' than Oracle given such and such an
> > > > > > environment? If so, why?
> > > > >
> > > > > Is there theory behind any of this? Any mathematical models or other
> > > > > formalisms? It seems to me that comparing Cache with Oracle for
> > > > > TCO is not on-topic on c.d.t.
> > > > >
> > > > > Does any of "what works today outside of the RM" have any theory
> > > > > behind it? This is a theory newsgroup after all.
> > > >
> > > > Hi Marshall. The reason I originally came to this list was to learn
> > > > what it was about the theory that led the industry down a path of
> > > > throwing out some good features such as lists, which I have used as my
> > > > primary example. I learned from this forum and elsewhere that the
> > > > theory has come back around to now permit nested structures, while a
> > > > huge amount of software implementations are stuck, for practical
> > > > purposes, with the flawed theory of what was once known as 1NF.
> > >
> > > I think the initial interpretation of 1NF was confused rather than
> > > 'flawed' - at the end of the day all theories are developed
> > > iteratively.
> >
> > OK, or perhaps "the use of 1NF" was flawed, while there is nothing
> > wrong with coining and defining it. I'm not sure that "nonsimple
> > domains" (in the definition) was ever nailed down as precise
> > mathematics. But if the mathematicians tell me the mathematics was not
> > flawed, then I'm good with that. It is the application of that
> > mathematics to data (the modeling of data) where my interests lie. The
> > mistake was requiring software development teams to model data in what
> > was termed 1NF.

> Well, either way, at least we are agreed that the modern relational
> model is not absolutely identical to its 36 year old counterpart, and
> some of the rough edges have been smoothed out.

Yes.

> Certainly accepts
> complex types. A lot of people don't realise this, and it's up to us to
> let people know Dawn, especially those who create the software based on
> RM.

I'm working on it, in my own little way. Thanks for spreadin' the news too.

> >
> > > Of course in math, a relation can contain an element from
> > > any domain, and once RM became established this was picked up on
> > > relatively quickly.
> >
> > I gather that you mean "in theory" it was picked up relatively quickly.
> > I'm not heavily tapped into what everyone out there is doing, but my
> > pals are not defining new domains right and left.
> >
> > > I think it's pretty much accepted now that how one
> > > operates on that complex element is not within the remit of the RM
> > > itself, and as such the DBMS must handle its decomposition.
> >
> > This is a fine distinction, but I'll buy that "relational theory" can
> > define itself as working with relations and as "orthogonal" to the
> > question of what domains are supported. This can simply be a matter of
> > definition. I'm not so sure I can buy that the relational model, that
> > being the model of data that SQL attempted to implement whether they
> > missed the mark by a lot or a little, can claim that any operators are
> > irrelevant, including those that are specific to one domain or another.

> RM definitely allows complex types. If an implementation of it doesn't,
> it is a bug.
>
> > Here is Date on Codd re the meaning of "data model"
> >
> > "Codd defines a data model in a 1980 paper Data models in database
> > management. By his definition a data model consists of a collection of
> > data structure types, operators that can be applied to instances of
> > these types and consistency rules that define valid states for the
> > data."
> >
> > Are these consistency rules only related to relationships between
> > relations?  Are they unrelated to the consistency of data values for an
> > attribute, the sets from which valid values may come?  Are constraints
> > related to domains outside of the scope of the relational model?  If
> > so, what is the name of the scope they are in?  Given that SQL
> > implements a model that is bigger/broader that includes specific
> > domains, for example, I need a name for what seems to me to be "a data
> > model" that is implemented (with flaws) by SQL. In your terminology
> > would that then be some sort of "uber data model"?
>
> It's the DBMS no?

That's not the name of the abstraction, that is the implementation. A whole raft of DBMS's look similar to each other and are often referred to as RDBMS's or RELATIONAL while another group might be called MUMPS, another PICK. The products are not the same -- Revelation (in the Pick family tree), is different from jBASE or UniData. SQL Server is different from Oracle. But jBASE and Revelation pretty much implement the same "data model" (PICK) as do SQL Server and Oracle (RELATIONAL).

> A complex type is still just an atom to the data model. Consider an
> image, a perfectly acceptable element - its just a list of pixels after
> all, but I don't expect my data model to natively handle its
> decomposition.

I'm OK if relational theory doesn't cover working with lists as long as relational theory is not all that is employed. We can use set processing for some things and pull in the list processing for others. It is only when relational operators demand exclusivity at some level that I have a problem with it.

> >
> > The good news is that even if theorists are split or narrowly define
> > the RM so that it no longer contains any of the issues it helped cause
> > in the industry, I think practitioners would generally understand the
> > relational data model to be the model that (at least in the 80's)
> > forbade nested values, repeating groups, multivalues, non-1NF, or
> > whatever you want to call it. So I think when I speak about "the
> > relational model" with practitioners, they pretty much understand that
> > it disallows lists as attribute values, for example.

> As I said, we've done the research practically in tandem it seems. Time
> to tell those practitioners that they're mistaken.

I'm tryin' to (but it can be brutal)

> >
> > I suspect we could both agree to the terminology that it is the advent
> > of the relational model that brought about what was termed 1NF and
> > disallowed non-simple domains (such as lists) even if we define the RM
> > differently today.
> >
> > > And of
> > > course that's the way it should be given that dates, strings and other
> > > decomposable types have no relevance to relational theory.
> >
> > Again, I guess I'll go along with a redefinition of relational theory
> > that says that it no longer cares if the value of an attribute is
> > itself a relation. But once upon a time, it was definitely a player in
> > the problems that arose from the relational model (not just from the
> > implementations thereof).
> >
> > > > So I want to talk about theory and its relationship to practice. We
> > > > don't need another two decades of flawed tools that blindly try to
> > > > follow another flawed theory. The industry had lists, then pooh-poohed
> > > > them, and now is bringing them back, where "the theory" seems to now
> > > > permit nested sets (although there are still many who are not ready to
> > > > accept that extension of the theory), and lists are accepted if defined
> > > > as user-defined types. But there are still no list operations in the
> > > > theory as best I can tell. If theory people want to discuss theory
> > > > sans "end users" of the theory (like me), they can do their work in a
> > > > vacuum, but then perhaps the industry would be well-served if more of
> > > > it (than in the past) would stay there so we don't repeat the mistakes
> > > > of the past (e.g. normalization as originally defined being implemented
> > > > before it was ripe).
> > >
> > > I've seen you refer to this as "throwing the baby out with the bath
> > > water". I'm not sure at all that's a good analogy, as it implies that
> > > complex types were the most important factor involved - if they had
> > > been RM would have seriously struggled,
> >
> > OK, I'll buy that.
> >
> > > whereas the other important
> > > advantages of the model over its competitors proved to be overwhelming,
> > > and it dominated in good old darwinian fashion.
> >
> > I never thought of darwinianism in terms of the marketing buzz
> > surrounding survival of one technology and not another, but ... ;-)
> >
> > > Unfortunately recent 'advances' in db work such as XML databases seem
> > > to be attempting to retrieve the 'baby' by rebuilding a bathroom
> > > without any planning schematics, installing an upside-down bath and,
> > > worst of all, no plumbing system.
> >
> > And I'd have to say that the baby isn't coming to life in that area for
> > me yet either, but we might still be in the pregnancy and the morning
> > sickness is awful (I'll skip my great anecdote on that one).
> >
> > > > So, while I want to talk about theory and its relationship to practice,
> > > > I'm not developing theory, and I don't know the totality of the theory
> > > > behind any di-graph models, for example. I suspect that there are many
> > > > here who would not accept anything other than set theory (functions are
> > > > sets, so I'm sure anything software developers do can be modeled as
> > > > sets if someone has a reason to do so.)
> > >
> > > Well, graph theory is constructed from set theory itself, a graph being
> > > defined as a triple (set of inputs, set of outputs, set of edges). It
> > > has a powerful theory layered on top of this definition and is hence
> > > applicable to a whole range of practicalities. Unfortunately
> > > information handling is not one of these
> >
> > Hmmm. www?

> I have personal friendships with some of the most influential people in
> hypertext and the internet. To a man they all loathe the hideous mess
> that is the web

And all of the women in software development in my county (OK, there might be another one somewhere) think it is beautiful in its usefulness, simplicity (well, maybe not, I can't believe how hard it was to make css and xhtml work in both IE and FF)

> and none would say it has anything to do with data
> modelling.

There are a lot of documents out there too, but use it as an app run-time container or an RSS feed and you start getting into structured data modeling, not just documents.

> It is a publishing medium (and a poor shadow of what it
> could have been at that),

Wouldn't it be better to say "could be"?

> but can hardly be processed as a data model.
> I'm guessing that your reference to www was tongue in cheek.

It was a counter-example. Your words were "information handling" so it worked.

> Graphs, when used at the logical level, cannot handle anything more
> than binary relationships,

I don't think of the question as "what can this handle" but "are these data structures and operators useful"? I like relations and relational operators and I also like navigating from node to node when it is helpful, such as from an order to its lineItems.

> and hey we know from irreducible tuples,
> that a lot of information _cannot_ be broken down into binary form.
> That's exactly why graph databases and the Semantic Web are such a load
> of tosh.

I just deleted a clever response here, but the upshot is that I've read quite a bit of the semantic web stuff and I'm not a believer yet either.
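
For anyone who wants the "cannot be broken down into binary form" point made concrete, here is a minimal sketch in Python. The supplier/part/project values are made up purely for illustration, and it only shows one three-column relation that is not recoverable from its pairwise projections; it is not meant as a general proof of anything.

```python
# Hypothetical ternary relation: (supplier, part, project).
r = {
    ("s1", "p1", "j2"),
    ("s1", "p2", "j1"),
    ("s2", "p1", "j1"),
}

def project(rel, idxs):
    """Project a set of tuples onto the given column positions."""
    return {tuple(t[i] for i in idxs) for t in rel}

# The three binary projections, i.e. what a graph of pairwise edges could hold.
sp = project(r, (0, 1))  # supplier-part
pj = project(r, (1, 2))  # part-project
sj = project(r, (0, 2))  # supplier-project

# Natural join of the three binary relations back into triples.
rejoined = {
    (s, p, j)
    for (s, p) in sp
    for (p2, j) in pj
    if p2 == p and (s, j) in sj
}

# Tuples the binary pieces imply but the original relation never stated.
spurious = rejoined - r
print(spurious)
```

The join brings back every original tuple plus one that was never asserted, so for this particular relation the three binary pieces say more than the ternary fact did.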

> > > , given some information
> > > relationships will not fit into a binary approach such as graphs at the
> > > logical level*.
> >
> > I'll admit I don't know what you are referring to, but are these
> > relationships absolutely essential to your average software
> > applications? Where is the show-stopper?
> >
> > > Believe me, I spent an an incredibly frustrating year
> > > attempting to 'make them fit' before conceding defeat - looking back it
> > > seems a very naive period, but it was an invaluable education.
> >
> > Just as with relational theory covering relations and something else
> > covering domain operators in a single "data model" (or what I would
> > term one), we can partition the space so that one theory meets some
> > requirements for solutions and another meets another. It could even be
> > partitioned so that some types of problems or domains use this data
> > model and others use another, right? (I fully accept that I'm not
> > "getting it" on this point and you may certainly point that out).
> > Cheers! --dawn

> I don't think data model is the correct term for what you mean. There
> are an infinite number of possible domains (consider that image element
> example), and it hence requires a higher level (I'm not sure we need to
> leet it up by calling it uber) to handle decomposing them. People seem
> to be starting to talk about a relational language on cdt though, so
> this may be a good sign.

And I do like Tutorial-D's group and ungroup that at least gets at some of what I want. Add in a few more features for list handling at the top level, permit synonyms (of different types), and a bunch of other features and we just might be able to combine a good theory with good practice. Pick is practical in a big bang for the buck way. Relational theory is elegant in a mathematical way. I don't want to cheat on either theory or practice, but if they aren't going to align, I'm stickin' with practical and flexible. Cheers! --dawn
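
A rough sketch of what I mean by group and ungroup, in Python rather than Tutorial D. The function names, the dict-per-row representation and the order/line_items attribute names are all assumptions for illustration. Note that the nested attribute here is a plain Python list, not a relation-valued attribute as Tutorial D would have it, so this is a loose analogy and not the language's definition.

```python
from collections import defaultdict

def group(rows, key_attrs, nested_attr):
    """Collapse flat rows into one row per key, nesting the other attributes."""
    buckets = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in key_attrs)
        rest = {a: v for a, v in row.items() if a not in key_attrs}
        buckets[key].append(rest)
    return [
        dict(zip(key_attrs, key), **{nested_attr: items})
        for key, items in buckets.items()
    ]

def ungroup(rows, nested_attr):
    """Flatten a nested attribute back into one row per nested item."""
    flat = []
    for row in rows:
        outer = {a: v for a, v in row.items() if a != nested_attr}
        for inner in row[nested_attr]:
            flat.append({**outer, **inner})
    return flat

# Hypothetical order lines in flat form.
flat_rows = [
    {"order": 1, "item": "pen", "qty": 2},
    {"order": 1, "item": "ink", "qty": 1},
    {"order": 2, "item": "pad", "qty": 5},
]

grouped = group(flat_rows, ["order"], "line_items")
roundtrip = ungroup(grouped, "line_items")
```

With this sample data, grouping by order and then ungrouping gives back the flat rows, which is the part I find useful: the nested view and the flat view are two presentations of the same data.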

> J.
>
> >
> > > Jim.
> > >
> > > (* That is apart from 'hypergraphs', which unintuitively allow edges to
> > > connect > 2 nodes, so allowing the handling of n-ary relationships.
> > > However, then the set of edges essentially becomes a set of tuples - an
> > > n-ary relation, and we are left with somewhat of a reinvention of
> > > relational theory.)
> > >
> > > >
> > > > Did that clarify?  If so, is that, or is that not a valid discussion in
> > > > this forum?  (Don't worry, even if you suggest it is valid to discuss,
> > > > I will still keep a low profile here as I know there are some who
> > > > really, really dislike having me around and I prefer the company of
> > > > those who are at least civil in their discourse when they disagree with
> > > > someone, as you, David, mAsterdam, JOG, x, and many others have always
> > > > been).
> > > >
> > > > Cheers! --dawn
Received on Sat Aug 12 2006 - 05:59:55 CEST
