Re: Mixing OO and DB

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sun, 02 Mar 2008 21:18:58 GMT
Message-ID: <6REyj.14660$Ej5.10875_at_newssvr29.news.prodigy.net>


"Patrick May" <pjm_at_spe.com> wrote in message news:m2bq5xylnc.fsf_at_spe.com...
> frebe <frebe73_at_gmail.com> writes:

>> On 2 Mar, 15:45, Patrick May <p..._at_spe.com> wrote:
>>> frebe <freb..._at_gmail.com> writes:
>>> >> >>      You've asked this before and it has been fully answered.
>>> >> >> The two components change at different rates for different
>>> >> >> reasons, especially in distributed applications and when the
>>> >> >> database supports multiple applications.
>>>
>>> >> > The only answer so far is about "denormailzation for
>>> >> > perforance", which is a very debatable argument.
>>>
>>> >>      It's not debatable at all.
>>>
>>> > Ok, show me an example of such schema change that wouldn't affect
>>> > the application code.
>>>
>>>      Denormalization for performance is just such a schema change.
>>> The data supported by the schema doesn't change, only the format.
>>> If you have SQL littering your application code, all of that has to
>>> be changed and retested.  If you have an O-R mapping layer, for
>>> example, only the mapping code needs to change.
>>
>> Still no example. I was asking for the schema before and after such
>> change. Then we could evaluate if the application code would have to
>> change too, or not.
>

> I'm not sure why you need an example here. Have you never seen
> denormaliztion done for performance reasons? Do you not realize the
> the entire point of doing so is to reduce the bottleneck posed by the
> database without changing the external behavior of the business logic
> that uses the database?
>

> One common denormalization technique is to eliminate a join by
> adding columns to one table that replicate data in another table that
> has a one-to-many relationship with the table being denormalized. The
> business logic shouldn't have to change just because the schema does.
>

Why denormalize when you could just use a materialized view?

>>> >> In addition, when a database supports multiple applications, the
>>> >> schema may need to change for one application but the others
>>> >> should be insulated from that change.  This is common in many
>>> >> enterprises.
>>>
>>> > Addning columns or tables wouldn't affect the other (old)
>>> > applications. If you have other changes in mind, please show an
>>> > example.
>>>
>>>      Changes to the schema to support new regulatory requirements
>>> are not uncommon.  Sometimes these are simple additions, but they
>>> can also involve splitting or, less commonly, merging existing
>>> columns.  Changing ordinality is another not infrequent change; an
>>> existing application can continue to function by using the primary
>>> or highest priority element on the many side of the relationship
>>> but a new application or component can take advantage of the full
>>> set.
>>
>> Please give one example (schema before and after).
>

> Again, why? Have you never experienced these kinds of changes?
> They happen often enough in my, and others, experience that the
> decoupling provided by OO techniques has proven useful. Despite what
> some relational proponents think, developers don't come up with these
> approaches because they dislike or don't understand relational
> databases. They do it because they're trying to solve real problems.
>

> Consider a regulatory change like Sarbanes-Oxley. Companies need
> to add additional monitoring rules, but they have existing
> applications in place that they don't want to change any more than
> absolutely necessary due to the high cost of doing so. The
> applications can be encapsulated to work without change within a
> different workflow. That workflow requires additional information in
> the database, in particular more detailed audit information and finer
> grained tracking of individual steps in the process. This results in
> schema changes but does not change the behavior of existing
> applications.
>
>>>      During development, many shops are using iterative methods
>>> that require changes to both the schema and the application.  When
>>> there is more than one project ongoing at the same time, insulating
>>> the application from the schema allows the new changes to be
>>> incorporated when it makes sense for the application team, rather
>>> than requiring all teams to change in lock step (a sure recipe for
>>> schedule slip).
>>
>> This is all very nice words, but at least we need some examples to
>> verify it.
>

> Particular schema examples will not help. Either you've
> experienced this or you haven't. If you haven't, you should at least
> be able to understand that these kinds of environments do exist and
> these kinds of changes do take place. This is not a discussion about
> particular schema changes, it is about the value of decoupling.
> Talking about specific schemas will take this down a rathole and miss
> the entire point.
>
>>> >> > Yes it is indeed bad design to force every SQL statement to be
>>> >> > wrapped inside a method.
>>>
>>> >>      So you prefer littering your code with embedded SQL?
>>>
>>> > I wouldn't write exactly the same SQL statement in many different
>>> > places, in that case I would use a function. But I would wrap
>>> > every SQL statement in a function.
>>>
>>>      Did you mean to say that you _wouldn't_ wrap every SQL
>>> statement in a function?  If so, you wouldn't pass a code review in
>>> my teams.
>>
>> Happily I don't belong to your team....
>>
>>> The maintenance cost of not doing so is too high.
>>
>> When someone fail to support his methods about software development,
>> he can always claim his methods has lower maintenance costs, since
>> it is almost impossible to verify such argument.
>

> This is where experience comes into play. No one I know who has
> actually developed a large system using OO techniques would ever embed
> SQL directly in the application logic. They've personally experienced
> the costs of doing so and don't want to go through it again.
>
>>>  >> This isn't even an issue for the systems I work on.
>>>
>>> > Why not?
>>>
>>>      I've recently been working on low latency systems in the
>>> financial services industry.  We have to keep the database, and any
>>> other disk based storage, completely out of the critical path of
>>> the business transaction or we won't meet our performance
>>> requirements.
>>
>> What kind of "business transactions"?
>

> Most recently a large risk analytics engine. The business
> transactions, or business use cases, if you prefer, involve
> calculations like value at risk for portfolios containing hundreds of
> thousands of positions. The only way to get acceptable latency is to
> co-locate the business logic, messaging mechanism, and data in the
> same operating system process. The only way to get acceptable
> throughput is to run dozens of those processes in parallel. Hitting a
> database would make the system unusable.
>

> Another example is determining the impact of faults in a
> telephone network. Probes return a large amount of information, only
> some of which is pertinent. Raising alerts that make sense at the
> business level, like "The SLA for Big Customer's video conference
> service is about to be violated." requires several levels of analysis
> and aggregation. That has to be done at in memory speeds.
>

> What kind of systems do you build? Perhaps we can find some
> common problem domains that would make this conversation more
> productive.
>

> Sincerely,
>

> Patrick
>

> ------------------------------------------------------------------------
> S P Engineering, Inc. | Large scale, mission-critical, distributed OO
> | systems design and implementation.
> pjm_at_spe.com | (C++, Java, Common Lisp, Jini, middleware, SOA)
Received on Sun Mar 02 2008 - 22:18:58 CET

Original text of this message