Re: EAV (Re: Object-relational impedence)

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sun, 30 Mar 2008 03:55:56 GMT
Message-ID: <gbEHj.17136$5K1.12616_at_newssvr12.news.prodigy.net>


"topmind" <topmind_at_technologist.com> wrote in message news:988c8429-32c9-4aa6-a913-c241da9eba32_at_h11g2000prf.googlegroups.com...
>
>
> Brian Selzer wrote:
>> "topmind" <topmind_at_technologist.com> wrote in message
>> news:f743faa4-962b-4048-9d3d-203951b57e17_at_e6g2000prf.googlegroups.com...
>> >
>> >
>> > Brian Selzer wrote:
>> >> "topmind" <topmind_at_technologist.com> wrote in message
>> >> news:1f649255-eba1-4e82-82b1-99228e761fa5_at_c19g2000prf.googlegroups.com...
>> >> >
>> >> >
>> >> > Brian Selzer wrote:
>> >> >> "topmind" <topmind_at_technologist.com> wrote in message
>> >> >> news:79cae24d-bca7-47e5-94f6-7bb5bfa508f1_at_s8g2000prg.googlegroups.com...
>> >> >> >
>> >> >> >
>> >> >> > Brian Selzer wrote:
>> >> >> >> "topmind" <topmind_at_technologist.com> wrote in message
>> >> >> >> news:a8855c5d-1e3f-46ef-a99a-5c09b94bfba5_at_u10g2000prn.googlegroups.com...
>> >> >> >> > (addendum)
>> >> >> >> >
>> >> >> >> > For an example of the need for flexible columns/attributes,
>> >> >> >> > consider
>> >> >> >> > a
>> >> >> >> > relational-based GUI system where widget state and info are
>> >> >> >> > tracked
>> >> >> >> > in
>> >> >> >> > some kind of RDBMS. We want it to be able to add new widgets
>> >> >> >> > from
>> >> >> >> > different vendors without each widget needing its own table(s)
>> >> >> >> > (unless
>> >> >> >> > its a really special need).
>> >> >> >> >
>> >> >> >> > There seems to be a need for either a EAV table (row-based
>> >> >> >> > dynamism)
>> >> >> >> > or Dynamic Relational (column-based dynamism) of some kind so
>> >> >> >> > that
>> >> >> >> > attributes that are not known up-front can be tracked.
>> >> >> >> >
>> >> >> >> > If you suggest otherwise, please present your design.
>> >> >> >> >
>> >> >> >>
>> >> >> >> Well it's really simple. You have a single table with
>> >> >> >> attributes
>> >> >> >> that
>> >> >> >> are
>> >> >> >> common to all possible widgets, and when a set of attributes
>> >> >> >> that
>> >> >> >> are
>> >> >> >> not
>> >> >> >> known up front needs to be tracked, a new table or set of tables
>> >> >> >> is
>> >> >> >> created
>> >> >> >> to house the set of attributes that references the table with
>> >> >> >> the
>> >> >> >> common
>> >> >> >> attributes. It's really very simple and doesn't require a
>> >> >> >> special
>> >> >> >> dynamism
>> >> >> >> (Is that the correct use of that word?) mechanism.
>> >> >> >
>> >> >> > Having different vendor widgets being able to create their own
>> >> >> > little
>> >> >> > table is not very practical. There are versioning issues, for
>> >> >> > one.
>> >> >> > What if a widget creates a table that does not work with the
>> >> >> > current
>> >> >> > version or brand of our GUI RDBMS?
>> >> >> >
>> >> >>
>> >> >> Since when do widgets create their own tables?
>> >> >
>> >> > I thought that was *your* suggestion. May I suggest you describe a
>> >> > sample scenario to reduce unstated assumptions between us.
>> >> >
>> >>
>> >> Widgets don't necessarily need to know how to serialize themselves.
>> >> They
>> >> only need to know how to provide a snapshot of themselves for
>> >> serialization.
>> >> Some other part of the system may be responsible for committing that
>> >> snapshot to the database.
>> >>
>> >> >>
>> >> >> > And, I find gazillion tables hard to navigate when looking for
>> >> >> > tables
>> >> >> > or debugging. I never liked table-happy designs, but individuals
>> >> >> > may
>> >> >> > have personal preferences that differ.
>> >> >> >
>> >> >>
>> >> >> For me, it's a lot easier to wade through a gazillion tables than
>> >> >> to
>> >> >> try
>> >> >> to
>> >> >> decipher the content in an EAV table.
>> >> >
>> >> > That's you, not me. In this case I prefer EAV.
>> >> >
>> >> >> The 'table-happy' design permits the
>> >> >> definition of relationships between attributes.
>> >> >> That is a whole lot more
>> >> >> difficult when everything is stuffed in an EAV table. For example,
>> >> >> suppose
>> >> >> that every widget that has attribute Y also has attribute Z. Try
>> >> >> enforcing
>> >> >> that rule with everything stuffed in an EAV table. With the
>> >> >> 'table-happy'
>> >> >> design, all you need is to specify that attributes Y and Z are in
>> >> >> the
>> >> >> same
>> >> >> table.
>> >> >
>> >> > I don't see why that would be a common scenario for GUI's. I cannot
>> >> > even think of a single realistic case right now. I am not saying it
>> >> > will never happen, but its not common enough to tilt the decision
>> >> > unless its close.
>> >> >
>> >> > Plus, lots of similar tables often results in having to do UNION
>> >> > queries, which are ugly and slow. With EAV, one could get/dump the
>> >> > GUI
>> >> > attributes in a single query. With your myriad table suggestion, one
>> >> > would have to first take an inventory of all custom-widget tables.
>> >> >
>> >>
>> >> I don't see why you would need to do union queries. I think you're
>> >> stuck
>> >> on
>> >> the idea that each widget type has its own table. That's not at all
>> >> what
>> >> I
>> >> am suggesting. There is a table for attributes that are common to all
>> >> possible widgets, and a table for each set of attributes that are
>> >> common
>> >> not
>> >> to all possible widgets, but rather to a subset of all possible
>> >> widgets.
>> >> So
>> >> for example, suppose that widget A has attributes T, U, V, and W, and
>> >> widget
>> >> B has attributes T, U, V, X, and Y, and widget C has attributes T, U,
>> >> V,
>> >> Y
>> >> and Z, you wouldn't necessarily have a table for each type of widget.
>> >> Assuming that T, U, and V are common to all widgets, you would have
>> >> one
>> >> table, the common table, with attributes T, U, and V. Let's also
>> >> assume
>> >> that values for T uniquely identify a widget, meaning that T would be
>> >> the
>> >> primary key of the common table. Then you would have a table with T
>> >> and
>> >> W,
>> >> a table with T and X, one with T and Y, and one with T and Z, each
>> >> referencing the common table. Widget A would have a row in the common
>> >> table
>> >> and one in the table with T and W; widget B would have a row in the
>> >> common
>> >> table, one in the table with T and X and one in the table with T and
>> >> Y;
>> >> widget C would have a row in the common table, one in the table with T
>> >> and Y
>> >> and one in the table with T and Z. When a new widget, D, that has
>> >> attributes T, U, V, X and P needs to be recorded, a new table with T
>> >> and
>> >> P
>> >> would need to be created, and then widget D would have a row in the
>> >> common
>> >> table, one in the table with T and X and one in the new table with T
>> >> and
>> >> P.
>> >
>> > How is this better than an EAV table? At least with EAV's, you don't
>> > have to create new tables for columns that don't already exist.
>> >
>>
>> You're kidding yourself if you think there is any less work with an EAV
>> table. So what if you have to create a new table for columns that don't
>> already exist. Unless you don't care about integrity, you will have to
>> alter the constraints on an EAV table whenever you add an attribute that
>> doesn't already exist. The columns in each new table are typed. The
>> values
>> in the EAV table cannot be, otherwise you wouldn't be able to store names
>> and addresses and numbers and images in the same table. This means that
>> for
>> each attribute, you have to define a constraint that specifies the values
>> allowable for that attribute. There are other reasons that this is
>> better.
>> I listed one earlier: if whenever A, B, then put A and B in the same
>> table.
>> Another is that indexes can be applied to individual columns or groups of
>> colums. Constraints are simpler, and therefore easier to read,
>> understand
>> and maintain, not to mention that being on separate objects, they can
>> operate independently, improving both performance and scalability.
>
> A bunch of joins is not exactly fast. I don't believe you.
>

Where did you come up with a bunch of joins? Even so, a merge join even over multiple tables has linear complexity when the inputs are already sorted.

>> If you
>> have hundreds of different attributes, then you have hundreds of sets of
>> values, and therefore hundreds of additional compares just to determine
>> which constraints apply. I could go on, but it should be obvious that
>> you
>> don't really gain anything by using an EAV table, you just end up
>> reinventing the wheel: instead of using the type checking that is built
>> into
>> the system, you have to roll your own.
>
> I am not much of a fan of types. A data dictionary can be used for
> validation if needed. Anyhow, this sounds like a perpetual argument
> that keeps repeating the same arguments over and over. Each solution
> has its own drawbacks and compromises.
>

Some more than others.

> And *if* new tables are to be created, I think dedicated tables per
> widget, other than the shared ones, is preferable to one table per new
> column.
>

To each his own. Can you offer any justification for what you think, or is it just your gut feeling.

> -T-
Received on Sun Mar 30 2008 - 05:55:56 CEST

Original text of this message