Re: Resiliency To New Data Requirements

From: JOG <jog_at_cs.nott.ac.uk>
Date: 17 Aug 2006 04:31:34 -0700
Message-ID: <1155814294.051678.9630_at_75g2000cwc.googlegroups.com>


erk wrote:
> JOG wrote:
> > All statements are 'structured' otherwise we wouldn't understand them.
>
> I disagree; "structured" implies some complex arrangement of parts. As
> such, the atomic "nodes" of said structure are (by definition)
> unstructured. That doesn't mean the user can't define functions over
> these values' types/domains.

I was referring to grammatically structured.

>
> > Hence using the term in that sense is of no use. Similarly all data can
> > be put into a relational database, but that hardly makes it a database
> > before its there.
>
> An obvious but important point, and well said.
>
> > Structured/Semi-structured/Unstructured terminology has been standard
> > for 15 years. Even before that in 1979 Codd himself made exactly the
> > same distinctions, except he used the terms formatted and unformatted,
> > for structured and unstructured respectively. He just never cemented
> > the phrases.
>
> If there are standard definitions for these terms, I'm unaware of them
> - do you have references?

[Boehm 1978] used 'structuredness' in relation to programming.
[Codd 1979] referred to structured (formatted) data.
[Belkin et al. 1992] referred to "unstructured or semistructured
data" for information retrieval
[Quass et al. 1995] referred to data with highly irregular schema.
[Abiteboul 1997] cemented the terminology in "Querying semi-structured
data"
etc.

Be warned, however: although several of the above papers popularized
the terminology, most of the work done with unstructured/semi-structured
data will make anyone versed in database theory feel distinctly faint.

Jim.

>
> - erk
Received on Thu Aug 17 2006 - 13:31:34 CEST
