Re: : RE: bytes vs chars
Date: Thu, 17 Mar 2016 08:49:01 -0600
Message-ID: <56EAC3DD.8080302_at_1001111.com>
I thought I sent this yesterday but ......
First we are mixing design and implementation which is never good.
On design:
On 03/15/2016 02:37 PM, Robert Freeman wrote:
> Oh, this topic gives me heartburn… Both sides have points, but I also think both sides are missing THE point. Data types and lengths are constraints – and should be treated as such.
>
> Neil - I don’t think anyone is suggesting that we make every column something on the order of a VARCHAR(2000) – just in case. I would call that an extreme case and in any discussion of worth - I
> think we throw out the extreme cases and send any DBA that supports them back to DBA 101 and basic data modeling. J(Hopes I have not just inadvertently sent Dave to the woodshed).
I in no way claim make every column VARCHAR2(2000). What I do want to say is do not put artificial constraints on that prevent storage of valid data. If the maximum possible length is unknown give the data lots of space, at least double what you think is needed. And don't worry about the woodshed, mine is quite large, very luxuriously appointed and even has an attached doghouse, also quite luxurious.
Snip ...
> *Why do I say best practices? It’s because I think we need to look at the size of a column**as a form of constraint*. *No different than an FK, PK, check or not null constraint.* We tend to be very
> methodical when creating these kinds of constraints, and I see no reason why we would not do the same with respect to column definitions. I think it’s important to pay attention to the notion of data
> type sizing as a constraint. Why?
>
> *Constraints have incredible value in that they protect the integrity of the data. Column length and type semantics are central in that role.*
I agree 100%, but is the maximum length of a column an inherent characteristic of the attribute we are storing data about? Or is it artificial? Barcodes, UWIs, IP addresses, MAC addressees are all examples of data where the maximum length is part of the definition or specification. Anyone who deals with data from machinery such as SCADA, CNC or PLC would be foolish to assign even an extra byte since the data formats are rigid and defined.
However, names, whether people, places or things, do not have a maximum length. There is no point in imposing an artificial constraint in the design stage. I recognize we must impose an upper limit in the implementation. Where attributes do not have an inherent maximum length I am a big fan of standard sizes. Please note I am not claiming my sizes are the best.
VARCHAR2(64) for names, 1-n words where n is fairly small "The Friendly Giant" VARCHAR2(255) for free form text of sentence length "The package will be picked up on Friday at 0900H" VARCHAR2(1000) for free form text of paragraph length, descriptions and comments.
Even this will not cover all cases. If running a global tourism site place names should be allowed to be 255 characters, a chemical compound database will need name fields of at least 1000 characters. Product names will often exceed 64 characters. Please don't make the new length VARCHAR2(80). Sigh.....
Anything larger than 1000 characters is affected by implementation and should be analyzed in depth, size, inline storage, use of LOBs, etc. The physical environment has a huge impact on large columns. Yet I find these fields are often the ones that have the least work put into their design.
On Implementation:
Jonathon and Neil, thank you very much for the posts and blog about the excessive redo issue. I was on holiday when the discussion was on the list so I missed it. I have always tried to avoid row chaining, now I have learn't another reason to keep a row within the blocksize..
However, I still go to my claim that this is the sort of problem that should be detected in testing. If I am pounding in 1 or 10 rows per second then the excessive resource usage should be obvious and must be fixed. If I am inserting 1 row per hour should I worry about it?
To misquote Knuth "premature optimization is the root of all evil."
Problems found in test do not affect production. Failure to store valid data does affect production.
YMMV
Dave
-- Dave Morgan Senior Consultant, 1001111 Alberta Limited dave.morgan_at_1001111.com 403 399 2442 -- http://www.freelists.org/webpage/oracle-lReceived on Thu Mar 17 2016 - 15:49:01 CET