Home » RDBMS Server » Performance Tuning » implications of merging 17 tables into 1...
implications of merging 17 tables into 1... [message #247660] |
Tue, 26 June 2007 13:12 |
orausern
Messages: 826 Registered: December 2005
|
Senior Member |
|
|
Hi,
In our data model part of the applications' functionality is there in a group of 17 tables to 20 inter related tables. Now there is an initiative to redesign this piece entirely and the architect is suggesting just one mega table instead of these 17! What could be the pros and cons of this approach?
Does it mean that there will be more self joins for the same table and writing queries for getting required data will be more complex?
What would be the right approach to analyze such a proposal?
With thanks,
Nirav
[Updated on: Tue, 26 June 2007 13:19] Report message to a moderator
|
|
|
|
Re: implications of merging 17 tables into 1... [message #247666 is a reply to message #247663] |
Tue, 26 June 2007 13:25 |
orausern
Messages: 826 Registered: December 2005
|
Senior Member |
|
|
Thanks for the reply. This much I know: The existing data model causes some redundancy...lots of duplicate data gets saved..now if you put everything into one table, there is significant savings in terms of space.
But what about other aspects like performance etc? these data are interrelated...so there are foreign keys etc...if we put it all into one mega table, would it be worth it? This is denormalization carried to quite an extreme I guess, but I have to analyze this in some details so for any pointers will be very helpful to me.
|
|
|
|
Re: implications of merging 17 tables into 1... [message #247911 is a reply to message #247668] |
Wed, 27 June 2007 08:43 |
orausern
Messages: 826 Registered: December 2005
|
Senior Member |
|
|
Here is one possible scenario that describes to an extent, the changes: Say that there are 5 tables and they are related by 4 referential constraints (I am taking 4 as just one number, there could be more relations, but for example say 4). Now when we compress these 5 tables into one table, how is does the ref. integrity is taken care of? Does the data get exposed to some risk of corruption?
Perhaps in the same table, there can be ref. keys referring to the same table (like mgr referring to empno in the emp table of scott schema) but in that case, when we do that for more than 10 relations, wouldn't that introduce much complexity?
Can you please share your views on this.
Thanks,
Nirav
|
|
|
Re: implications of merging 17 tables into 1... [message #247923 is a reply to message #247660] |
Wed, 27 June 2007 09:38 |
|
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
there is precious too little information here for a real answer. I might say this though:
1) why are you asking us? ask your Architect first why he/she thinks this one big table is such a good idea. Let him/her articulate the benefits expected, and how they will be measured. Then you can get lots of comments from this peanut gallery on Orafaq of what we think.
2) one can always fall back on what has been hinted at here already. The first and formost reason why "mega tables" or "generic" tables are looked down upon is because you loose any sense of what was actually modeled. A traditional model has tables for each thingy. The mega table approach does not. In my experience, this loss of entity identity has at least four initial ramifications:
a) you loose the ability to use referential constraints at the database level. You can't just declare a foreign key when you have generic columns, cause they fail. This leads to writing lots of special code in triggers (or god forbid in your application) to enforce RI, and you thus you recreate a feature that the database already provides.
b) you still need to tell people what is in the table and how to get data out of it without making a mistake. This leads to creating a considerable amount of meta data inside the database to describe what is in this mega table, along with more code that sits on top of it to make sure people don't fetch data out of it incorrectly. Naturally any data you load into the database must be maintained so there now is a burden here you didn't have. Make not mistake, one somebody starts deciding on the meta data they want to capture they will of course see lots of opportunity to "improve" upon a simple table definition. they will start trying to model business rules of various natures and it gets real complex. And once again you will have recreated a feature that the database already provides.
Hmm... you keep recreating features the database already provides, what was it you said about redundancy earlier? I'll bet your friends also want to create their own message queueing factility too (its just a guess, (can you spell Oracle advanced queues? (oh, but they think that is overkill for what they need right? so lets just write own which will be better and simpler and we can recreate a feature the database already provides))).
c) your datamodel can become a jumbled mess. The most common use of a mega table is to combine lots of "reference" tables that look like they have a common structure, into one table, the idea being that somehow fewer tables is better tables. But, this only leads to an outrageous number of foreign keys pointing back to the mega reference table (assuming you use surrogates). Everything in the end points back to this table multiple times. The only way to get a modeling tool (say like Erwin) to print a decent picture, is to remove foreign keys back to your mega table before you diagram. Otherwise the model makes no sense visually. This is a nother side affect of the loss of information that comes with "generic" models.
d) Loss of time. You will face two problems again of a practical nature when you use generic mega tables. 1) You will have a training obstacle. Contrary to popular belief, mega tables don't simplify the overall end result, they complicate it. You have to spend lots of time learning how to use it correctly, before you can create an working code (and anyone who says this is not true of overstated hasn't really lived with "generic" "mega tables" over the year). 2) And, you will spend countless hours arguing with DBAs and Programmers and Architects about which tables should be consumed by your mega table and which ones should not (what is a reference table anyway?). When a new table is modeled should it be folded into the mega table or not? blablabla? I can tell you for sure, non of these hours spent in meetings will be fun.). Maybe in the end you will define a rule that says when a table should go into the mega table so you don't have to argue about it. Of course the rule can't possibly consider the nature of the data first (but who models for real these days anymore).
No, I am not a big fan of generic mega tables. They have always been a programmers hack at modeling in my opinion.
But, I caution all, we should wait to see if I really understand anything of what you are trying to do. So, please post a few more details, and the reasoning your Architect provides for the change.
Kevin
|
|
|
Re: implications of merging 17 tables into 1... [message #248289 is a reply to message #247923] |
Thu, 28 June 2007 09:53 |
orausern
Messages: 826 Registered: December 2005
|
Senior Member |
|
|
Thank you very much for sharing your insights!! This is great and it is so useful to me as it clarifies so many of the concepts.
To continue on this, as it was very appropriately said:
Quote: |
But, I caution all, we should wait to see if I really understand anything of what you are trying to do. So, please post a few more details, and the reasoning your Architect provides for the change.
|
The reasoning for this change is:
the new design will result in simplification, reduction in storage space, and it will be more scalable and maintainable. The second point is backed with statistics...when the existing data was taken for in this new design via migration, there is an 80% reduction in storage space (and in no. of rows).
Now how this is done- here is a generic example of the way this thing is proposed:
Lets say we have following 5 tables:
EMPLOYEE_MASTER (emp_id pk , the rest of tables have this key and they are connected via ref. constraints to this table)
EMPLOYEE_FAMILY_INFO
EMP_PAYROLL_INFO
EMP_PAYROLL_DTL(related to emp_payroll_info..its a detail table for this one)
EMP_HOBBIES
EMP_HR_INFO
Now the rest of the five tables are all related to the first table, via primary key, and lets also say that EMP_PAYROLL_DTL is related to EMP_PAYROLL_INFO. This is a simplified analogy to the more complex actual thing (details of which I cant post as one can not give any actual detail in the forum, one can only refer a generic tech. problem. )
Now for this existing model, the 'new model' will be to have one table..lets call it MEGA_TABLE
So our MEGA_TABLE consist of which design : a bunch of columns all chosen together, taken from the rest of the four tables...and here is the main thing: a 'master' column (so to say), lets name it: META_COLUMN. Now this meta_column will contain a value like:
EMPLOYEE_FAMILY_INFO
EMP_PAYROLL_INFO
EMP_PAYROLL_DTL
EMP_HOBBIES
EMP_HR_INFO
which will serve to identify that 'this row' of the table actually consists of data of 'this component' like:
if this column has value 'EMP_HOBBIES' then we 'understand' (meaning the app will figure out via the extra coding etc) that this row is pertaining to only to the 'hobbies' related data and all other columns (which are relevant to other things like EMP_HR_INFO, EMP_FAMILY_INFO etc of the old model) of this MEGA_TABLE will be empty.
In this way this column no longer remains a column in the traditional sense of it(if I could put it that way), but becomes a sort of identifier for what used to be an entity in the earlier model. So in our traditional way: all rows of a table represent same type of info. but here in this model its no logner true! Each row represents a different type of info, all depending on this 'magical' 'META_COLUMN'. And this mega_table may have upto 70 or more columns , but only relevant columns will have data, others will be null.
So that is one generic example I could think of as analogy to something similar but much complex what is there in reality.
Again I am very thankful for this noble sharing of insights!
Nirav
[Updated on: Thu, 28 June 2007 09:57] Report message to a moderator
|
|
|
Re: implications of merging 17 tables into 1... [message #248335 is a reply to message #247660] |
Thu, 28 June 2007 12:28 |
|
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
sorry, don't take this the wrong way, but I don't believe you.
More correctly I mean to say, that the example you have quoted is inadequate at best and not at all representative of the real proposal. There is nothing in the example you have given that in any way achieves your stated goals. I appreciate the time you took to give us this example, but it is not of any use because:
There is no simplification here. Indeed, this collapsing does just the opposite. Number of tables is at best a mediocre indication of complexity, and reducing number of tables by such a method as this, without regard for modeling of data is a bean counters way of adding things up into a nice neat box. But it won't simplify, it will complicate matters. You should be able to see this as obvious if only because you have already started to do the things I mentioned previously. You have started to create your own meta-data, and write interface code to interpret the real data based on this meta data. Sometimes a smart idea can be achieved with the help of a little meta-data and a few choice pieces of slick code, but that is not what you got here.
There is no space savings. The example presented does not reduce number of rows, or reduce average row length of the data. So where can someone possibly say there is a space savings? The amount of actual data stored has not changed. You have simply moved rows from one table to another. If indeed you did achieve a 5 fold reduction in size during migration, then there is something else you have not told us about. How for example did you manage to drop rows during your migration? There is no dropping of rows evident in the example you have provided. You do see this right? Maybe you are seeing space savings because your current database tables are poorly managed and you could recoup the lost space simply by changing some storage properties, or taking a look at unused indexes? Or maybe there is more to the story? Also, just how much space are we talking about; a couple hundred MB or a couple hundred GB? Hey who knows, maybe somebody turned database compression and compressed your new table and/or indexes and forgot to mention that fact.
There certainly is no easier management. How does more complicated data storage and more complicated data access result in an expectation of easier management?
I am sorry but I ain't seeing it. What else did you want to tell us?
Again too, did you ask your Architect for clarification? If you are really concerned, show him this email trail and tell him you are interested in the apparent difference of opinions being presented. If your Architect is worth his/her salt, the he/she will be happy to give you 30 minutes to explain why the migration is going to be great, with some concrete examples. If they instead refuse, or worse yet, get mad that you even asked, then at least you know they are full of it and you need to escalate to your management.
So Nirav, what else do we need to know to help, and tell us what your Architect said.
Kevin
|
|
|
|
Re: implications of merging 17 tables into 1... [message #248349 is a reply to message #248338] |
Thu, 28 June 2007 13:09 |
orausern
Messages: 826 Registered: December 2005
|
Senior Member |
|
|
The analogy I had given was only for the purpose of explaining how different entities are to be merged into one. Not for explaining how the space savings are achived. This is another detail, and I haven't yet got onto it. Anyway, thank you for all the information. I think I have lot more clarity on a few imp. concepts regarding which I wanted some help.
[Updated on: Thu, 28 June 2007 13:11] Report message to a moderator
|
|
|
|
|
Goto Forum:
Current Time: Sat Nov 23 08:22:19 CST 2024
|