Re: Resiliency To New Data Requirements

From: erk <eric.kaun_at_gmail.com>
Date: 17 Aug 2006 08:49:18 -0700
Message-ID: <1155829758.723745.226990_at_75g2000cwc.googlegroups.com>


dawn wrote:
> > The word "structured" here is a waste of time. To use a programming
> > analogy, a 10,000 line program of imperative spaghetti code has
> > structure - one could argue too much, of the graph sort, while too
> > little of the module sort or function sort. "Structured" as a boolean
> > (or fuzzy-logic) function isn't the point -
>
> Sorry, I thought that was the topic at hand. I understand that
> discussions about which structure something takes might be of more
> interest. But I do think there is a distinction between what is
> structured text and what is unstructured text.

I don't think so. It's the endless "what vs. how" debate - it goes all the way down and all the way up. Which is which depends on your point of view. Structures are more or less useful in different contexts, and ground our viewpoints.

> Unstructured text would
> be treated by a database like an attribute with values that are mp3's,
> except character compared to binary data. Any structure to the data
> sits above (or below) this unstructured data.

And I think it's only in information exchange that the issue of parsing (translating a blob of characters or bytes into a structure) is so important. For data management, values are merely that, and unstructured blobs are useful primarily as an historical artifact ("we received this blob on this date at this time").

> I agree that is also relevant, but would not want to dismiss the
> relevance of talking about structured vs unstructured data.

"Structured" isn't an adjective. It's a family of them. "XML-structured" vs. "relation-structured" vs. "tree-structured" etc. Not necessarily mutually exclusive.

> > "Unstructured" just means uninterpreted in a given context, and that
> > means a type or domain in the control of the users.
>
> Yes!

Which also implies that the DBMS is ignorant of it. Users "manipulate" it using the functions at their disposal.
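To make that concrete, here is a minimal sketch, assuming SQLite as a stand-in DBMS. The table `page`, the column `body`, and the function `title_of` are all hypothetical names chosen for illustration. The DBMS stores and returns the blob but knows nothing about what is inside it; any interpretation comes from a function the user supplies.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")

# To the DBMS, "body" is just a value of an uninterpreted domain (a blob).
conn.execute("CREATE TABLE page (url TEXT PRIMARY KEY, body BLOB)")
conn.execute(
    "INSERT INTO page VALUES (?, ?)",
    ("http://example.org/a", b"<html><title>Hello</title></html>"),
)


def title_of(body):
    """User-supplied interpretation of the blob: pull out the <title> text."""
    match = re.search(rb"<title>(.*?)</title>", body, re.S)
    return match.group(1).decode("utf-8") if match else None


# The user hands the DBMS a function; the DBMS itself stays ignorant
# of the blob's internal structure.
conn.create_function("title_of", 1, title_of)

row = conn.execute("SELECT url, title_of(body) FROM page").fetchone()
print(row)
```

Everything the query can say about the inside of `body` comes from `title_of`, not from the DBMS.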

> > > R(URL, html, foreignKeyList)
> > >
> > > That's some structure, right?
> >
> > Sure - without the pesky underlying semantics. Each user of a piece of
> > such a "structure" is going to need to layer semantics atop this, with
> > functions to decompose and combine pieces of it.
>
> Sure, there is a lot that can be done with such a structure. Perhaps
> we only disagree on the double quotes around the word structure ;-)

This is the classic generality trap: it's so general you can do anything with it, but nothing easily or maintainably. It also lacks the symmetry that is so useful in relations and functional programming paradigms.
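A rough sketch of the symmetry point, using plain Python sets of tuples as stand-in relations. The sample URLs and the names `R`, `LINK`, and the helper functions are made up for illustration; only the shape R(URL, html, foreignKeyList) comes from the thread. With the list-valued attribute, "what does this page link to" and "what links to this page" are answered in two different ways. With a flat LINK(src, dst) relation, both are the same kind of restriction.

```python
# R(URL, html, foreignKeyList): the outgoing links are buried in a list value.
R = {
    ("http://example.org/a",
     '<a href="http://example.org/b">b</a>',
     ("http://example.org/b",)),
    ("http://example.org/b",
     '<a href="http://example.org/a">a</a> <a href="http://example.org/c">c</a>',
     ("http://example.org/a", "http://example.org/c")),
}


def targets_via_r(source):
    """Outgoing links: find the one tuple, then return its list."""
    return {dst for (url, _html, fks) in R if url == source for dst in fks}


def referrers_via_r(target):
    """Incoming links: every tuple's list has to be opened and searched."""
    return {url for (url, _html, fks) in R if target in fks}


# The same information as a flat relation LINK(src, dst).
LINK = {(url, dst) for (url, _html, fks) in R for dst in fks}


def targets_via_link(source):
    """Outgoing links: restrict on src."""
    return {dst for (src, dst) in LINK if src == source}


def referrers_via_link(target):
    """Incoming links: restrict on dst, the mirror image of the above."""
    return {src for (src, dst) in LINK if dst == target}


print(referrers_via_r("http://example.org/a"))
print(referrers_via_link("http://example.org/a"))
```

Both forms give the same answers here; the difference is that the two LINK queries are mirror images of each other, while the two R queries are not.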

> > "Structured" means, roughly, having a form or pattern of composition
> > (assumes component parts). In that sense, every "node" of a structure
> > is equally unstructured (all domains are equal, and while each can have
> > functions which evaluate to values of some type, these functions are
> > all orthogonal to the structure).
>
> Yes. So are we then agreed that the web has a structure, even if every
> node is equally unstructured? --dawn

The web has a structure, one moderately useful for browsing but too impoverished for data management.

- erk
Received on Thu Aug 17 2006 - 17:49:18 CEST
