Re: Network databases

From: <lynn_at_garlic.com>
Date: 13 Jan 2005 08:23:43 -0800
Message-ID: <1105633423.355972.274810_at_f14g2000cwb.googlegroups.com>


Alfredo Novoa wrote:
> I can't disagree more!
>
> It is exactly the contrary. There is little difference if you have
> two
> tables or less, but the difference increases exponentially when the
> number of tables grows.
>
> It is very easy to manage many tables at the same time using
> updateable views. And you might create views that use views.

hum, yes, well; there was this bldg. in san jose referred to as sjr or bldg. 28. I had an office on the 1st floor, backus had an office down the hall and codd had an office above on the second floor. there was this project going on in sjr to implement something called system/r and sequel, random system/r past posts
http://www.garlic.com/~lynn/subtopic.html#systemr

i've joked that SQL was part of competition between san jose research and yorktown research ... where query-by-example was going on for best TSL ... aka QBE vis-a-vis SQL. some random qbe past posts

http://www.garlic.com/~lynn/2002e.html#44 SQL wildcard origins?
http://www.garlic.com/~lynn/2002o.html#70 Pismronunciation
http://www.garlic.com/~lynn/2003n.html#11 Dreaming About Redesigning SQL
http://www.garlic.com/~lynn/2003n.html#18 Dreaming About Redesigning SQL
http://www.garlic.com/~lynn/2004l.html#44 Shipwrecks

about 10 miles south there was this other bldg ... santa teresa lab or bldg. 90. It had opened the same week as the Smithsonian air & space museum. it had access methods, databases, and language products. I would work some number of days in bldg. 90 ... riding my bike. south silicon valley/coyote valley has this interesting weather pattern where I would have a strong head wind riding south in the morning and a strong head wind riding north in the late afternoon.

so the physical databases of the 50s and 60s were developed when there was limited real disk space and limited real memory. direct physical pointers conserved the limited amount of scarce resources. they even had structures like isam ... where you could write I/O programs that could pick up physical pointers and follow them outboard in the i/o subsystem w/o bothering the processor. some amount of past postings about changing constrained physical resources in the 70s: http://www.garlic.com/~lynn/subtopic.html#dasd

I had started writing some stuff that over a 10-15 year period the relative disk system performance had declined by an order of magnitude (memory, disk capacity, processor had increased by factors of 50, disk access thruput had only increased by 3-5 times). this annoyed the disk division and the disk division performance group was assigned to refute my statements. after a couple months they came back and said that i had slightly understated the issue.
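a rough sketch of that arithmetic, assuming "relative" means disk access thruput growth divided by the growth of everything else. the function name is made up for illustration; the factors are just the ones quoted above.

```python
# Rough arithmetic for "relative disk system performance".
# Assumption: "relative" = disk access thruput growth divided by the
# growth of the rest of the system (memory, disk capacity, processor).

def relative_disk_performance(system_growth, disk_thruput_growth):
    """Return disk thruput growth as a fraction of overall system growth."""
    return disk_thruput_growth / system_growth

SYSTEM_GROWTH = 50            # factor quoted for memory/capacity/processor

for disk_growth in (3, 5):    # range quoted for disk access thruput
    rel = relative_disk_performance(SYSTEM_GROWTH, disk_growth)
    print(f"disk x{disk_growth}: relative performance {rel:.2f} "
          f"(a decline of about {1 / rel:.0f}x)")
```

at the top of the quoted range (5 times) the relative decline is a factor of 10, i.e. an order of magnitude; at the bottom (3 times) it is somewhat worse than that.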

in any case, during the 70s, real storage, disk space and processor were increasing dramatically, the cost of hardware was declining and the cost of people was increasing. also with the increase in disk space sizes, the amount of data that had to be manually managed was increasing significantly.

the arguments about system/r doubling the disk space and having layered index between ... was becoming less of an issue because the hardware costs were dropping and the relative amount of disk space was increasing. it was also now possible to start caching lots of the index structure in the increasing amounts of real storage available (instead of incrementally threading thru the index structure because there was no excess real storage to keep any cached information around).
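a toy model of the caching point, counting simulated disk reads for keyed lookups through a layered index with and without keeping index blocks in real storage. this is only an illustration, not how system/r or any product organized its index; the class, its fields and the fanout are all hypothetical.

```python
# Toy model: count simulated disk reads for lookups through a layered
# index. Every name here (LayeredIndex, read_block, ...) is hypothetical.

class LayeredIndex:
    def __init__(self, keys, fanout=4, cache_index=False):
        self.fanout = fanout
        self.cache_index = cache_index
        self.cached = set()       # index blocks currently held in memory
        self.disk_reads = 0
        # Number of index levels needed to reach any key at this fanout.
        self.levels = 1
        while fanout ** self.levels < len(keys):
            self.levels += 1
        self.position = {k: i for i, k in enumerate(sorted(keys))}

    def read_block(self, block_id):
        if self.cache_index and block_id in self.cached:
            return                # already in real storage, no I/O
        self.disk_reads += 1
        if self.cache_index:
            self.cached.add(block_id)

    def lookup(self, key):
        pos = self.position[key]
        # Thread through the index from the root down to the leaf level.
        for level in range(self.levels):
            span = self.fanout ** (self.levels - level)
            self.read_block((level, pos // span))
        self.disk_reads += 1      # the data record itself is always read
        return pos


keys = range(64)
for cache in (False, True):
    idx = LayeredIndex(keys, cache_index=cache)
    for k in keys:
        idx.lookup(k)
    print(f"cache_index={cache}: {idx.disk_reads} simulated disk reads")
```

with 64 keys and a fanout of 4 the toy model counts 256 reads when every lookup threads thru the index from disk, and 85 when index blocks stay in memory after first use. these are counts from the model, not measurements of anything.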

All of this was being traded off against savings in time for people (becoming scarcer and more expensive) who were having to deal with the increasing size of data to be managed (driven by the relative increase in disk space sizes).

So with some amount of resistance continuing from bldg. 90 and the database product organization ... the system/r tech transfer went from bldg. 28 to endicott to become sql/ds. later there was a sort of tech transfer from endicott to bldg. 90 to become db2.

so somewhat in parallel with some of this ... there was a small contingent in bldg. 90 looking at doing a "modern" network database implementation ... doing a lot of abstracting so that the database users are separated from a lot of low-level physical database gorp ... in much the same way that system/r had abstracted a lot of those details in relational. Some amount of the higher level abstraction work was also influenced by Sowa. So they came up with a query language paradigm that removed the physical pointer and lots of the network navigation characteristics from the interface (analogous to what SQL accomplished). eventually they wanted to do a side-by-side comparison with db2 on a level playing field.

somewhat west of 28 about 10 miles was bldg. 29 or the los gatos lab. I'm not sure of all its history, it was built in the 60s and housed ASDD for a time (possibly even advanced system development division hdqtrs). It seemed that ASDD sort of evaporated with the death of FS ... random FS past postings
http://www.garlic.com/~lynn/subtopic.html#futuresys

They had done AM1/AM0 there ... which had eventually morphed into VSAM and became responsibility of product group in bldg. 90.

At the time of the side-by-side comparison, most of bldg. 29 was occupied by the VLSI chip design group. For the comparison they chose an extremely network oriented structure, a large CPU chip ... all the circuits that go into the chip (and pretty non-uniform ... not like what you might find in something like a memory chip). On the same machine with the same system and operations ... load the chip specification into the database. The comparison would be elapsed time from start of initial query until the chip was drawn on screen ... no tuning and no optimization.

The SQL query statements were on the order of 3-5 times larger and more complex ... and it quickly became clear that with a level playing field, a side-by-side comparison of untuned and unoptimized, DB2 was ten times slower. So to make it a little more fair to DB2, the whole thing was given to some DB2 performance gurus for a couple weeks ... they were allowed to use every DB2 trick in the book, trace the query to death and re-org it every way possible. They were eventually able to get totally optimized DB2 so that it was only three times slower than the untuned and unoptimized comparison.

Now, it was easy to show that DB2 was possibly ten times faster than this "modern" network implementation for a single large bank account oriented table .... however for anything that was large, complex, and non-uniform ... DB2 couldn't touch it .... either in complexity of the query statements or in thruput/performance. The abstraction of how the paradigm was presented also made it much simpler to change and update the organization (in addition to simply adding/deleting data) for complex organizations.
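to give a feel for why a network-shaped structure like a chip netlist reads differently in the two styles, here is a small sketch. it is not either product's query language or implementation; the component names and both functions are invented for illustration. one function follows links from component to component, the other repeats a self-join over a flat table of connections until nothing new turns up.

```python
# Toy netlist: each row says "output of the first feeds input of the second".
# Component names are invented for illustration.
CONNECTIONS = [
    ("clk", "latch1"), ("latch1", "adder"), ("adder", "mux"),
    ("mux", "latch2"), ("latch1", "mux"), ("latch2", "out"),
]

def reachable_by_navigation(start):
    """Network style: follow links from each component to its neighbours."""
    # In a network database the links would be stored with the data;
    # here they are rebuilt from the table just to keep the sketch small.
    links = {}
    for src, dst in CONNECTIONS:
        links.setdefault(src, []).append(dst)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def reachable_by_joins(start):
    """Relational style: repeat a self-join until no new rows appear."""
    frontier = {dst for src, dst in CONNECTIONS if src == start}
    seen = set(frontier)
    while frontier:
        # One more join of the current frontier against the whole table.
        frontier = {dst for src, dst in CONNECTIONS if src in frontier} - seen
        seen |= frontier
    return seen

print(sorted(reachable_by_navigation("clk")))
print(sorted(reachable_by_joins("clk")))
```

both functions return the same set of components; the difference is that the join version rescans the whole connection table once per hop, which is roughly the shape of work that grows with a deep, non-uniform structure.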

Along the way, I got to write code for both implementations ... help with things like tech transfer of system/r from bldg. 28 to endicott for sql/ds, etc.

for some topic drift ... "sequel"
http://www.mcjones.org/System_R/SQL_Reunion_95/sqlr95-System.html#Index111

... from above ...

Don Chamberlin: So what this language group wanted to do when we first got organized: we had started from this background of SQUARE, but we weren't very satisfied with it for several reasons. First of all, you couldn't type it on a keyboard because it had a lot of funny subscripts in it. So we began saying we'll adapt the SQUARE ideas to a more English keyword approach which is easier to type, because it was based on English structures. We called it Structured English Query Language and used the acronym SEQUEL for it. And we got to working on building a SEQUEL prototype on top of Raymond Lorie's access method called XRM.

... snip ...

Lorie and I (and a couple others) transferred from scientific center to the west coast about the same time
http://www.garlic.com/~lynn/subtopic.html#545tech

now there is this other stuff out there that goes somewhat GML->SGML->HTML->XML, etc (somewhat analogous to the transition from SEQUEL->SQL) where GML was invented at the science center and the letters "G", "M", and "L" stand for initials of the people that invented it ... and the same Lorie (in the above) is the "L" in all those ML things floating around out there.

Received on Thu Jan 13 2005 - 17:23:43 CET
