Re: Design table with many columns

From: Mladen Gogala <gogala.mladen_at_gmail.com>
Date: Fri, 5 Jul 2024 16:52:31 -0400
Message-ID: <2c6d0d67-5b4c-4ce4-b4b6-e0ba13d78aee_at_gmail.com>


On 7/5/24 12:28 PM, richard goulet wrote:
    This is a pretty old issue that has been around for many a decade, even before DBMSs appeared. The solution from back then is to assign each entity some type of ID that serves as the key of a header table as well as of a multi-row spec table, where each distinct and variable attribute can be stored without making a mess of the RDBMS. Let me go back to a VERY OLD configuration I worked on and with:

    We had a header table with an NSN, a NOUN, and other associated columns, and a SPECS table that had the associated NSN column, a SPEC_NAME and a VALUE column, as well as others that held the variable data.

    Some NSNs had 10 spec rows while some had several hundred, depending on what you were describing. The NSN was the primary key in the first table and a foreign key in the second.

    Did something similar in civilian life afterwards for electronic modules in a manufacturing environment, and that ported nicely between databases as well, so we could have Oracle internally and Postgres on the external web site. The same queries worked on both sides.
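
For anyone who hasn't seen that pattern, a minimal sketch of the header/specs layout described above might look like the following. Table names, column types and the sample NSN are my own illustration, not from the original system; VARCHAR2 is the Oracle spelling, Postgres would use VARCHAR:

    CREATE TABLE item_header (
        nsn   VARCHAR2(13) PRIMARY KEY,   -- National Stock Number
        noun  VARCHAR2(100) NOT NULL      -- item name
        -- ... other fixed attributes
    );

    CREATE TABLE item_specs (
        nsn         VARCHAR2(13) NOT NULL REFERENCES item_header (nsn),
        spec_name   VARCHAR2(100) NOT NULL,  -- which attribute this row describes
        spec_value  VARCHAR2(4000),          -- the attribute's value
        PRIMARY KEY (nsn, spec_name)         -- one row per attribute per item
    );

    -- An item's variable attributes come back as rows, however many there are:
    SELECT h.noun, s.spec_name, s.spec_value
      FROM item_header h
      JOIN item_specs  s ON s.nsn = h.nsn
     WHERE h.nsn = '5305-00-068-0500';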

Yes, the problem is solvable by data modelling. Having a table with a gazillion columns is usually bad for performance. A refined model, brought to 4th normal form, should never have tables with a huge number of columns. A googolplex is the limit. However, I find that packing related fields into a JSON column stored as a BLOB (23f will have a separate data type for JSON, and one for Boolean too, in the next century, when it arrives) is actually quite handy. I've used it to pack together things like addresses. Instead of creating several columns, you can pack them into something like this:

{ "street":"1600 Pennsylvania Ave",

  "apt": NULL,

  "state": "Washington DC",

  "zip": 20500}

That eliminates the need to create separate addr_line1, addr_line2, state and zip columns. This is better than creating separate data types in the OO style, because JSON can be parsed by almost all modern programming languages, Java and Python among them, both OO languages. I am not sure whether COBOL can parse JSON, but I wouldn't be surprised to find out that it can. The "traditional" OO solution would be to bundle a bunch of columns into an OO data type and to also bundle the handling procedures into that type. However, passing such objects to languages like C is rather inconvenient. JSON structures, on the other hand, can be parsed by almost anything, so it's much easier to program with them. The problem of interfacing with external applications is what held back OO within the database. JSON is something similar, but more convenient.
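
A quick sketch of how that can look in practice, with names of my own choosing. On pre-23 Oracle releases the column is a BLOB with an IS JSON check constraint, and scalar fields come back out via the standard JSON_VALUE function; on Postgres the column would be JSONB and the extraction is usually spelled with the ->> operator instead:

    CREATE TABLE customers (
        customer_id  NUMBER PRIMARY KEY,
        name         VARCHAR2(100),
        address      BLOB,               -- the packed JSON document
        CONSTRAINT address_is_json CHECK (address IS JSON)
    );

    -- Pull individual address fields back out of the document:
    SELECT name,
           JSON_VALUE(address, '$.street') AS street,
           JSON_VALUE(address, '$.zip')    AS zip
      FROM customers
     WHERE JSON_VALUE(address, '$.state') = 'Washington DC';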

I swear to use the key, the whole key and nothing but the key, so help me Codd.

--
Mladen Gogala
Database Consultant
https://dbwhisperer.wordpress.com
--
http://www.freelists.org/webpage/oracle-l

Received on Fri Jul 05 2024 - 22:52:31 CEST
