Re: Object-relational impedence
Date: Tue, 04 Mar 2008 19:14:48 GMT
Message-ID: <Ichzj.14289$e_.9713_at_trnddc03>
>> All attempts by applications to access a DB's tables and columns
>> directly violates design principles that guard against close-coupling.
>> This is a basic design tenet for OO. Violating it when jumping from OO
>> to RDB is, I think, the source of problem that are collectively and
>> popularly referred to as the object-relational impedance mismatch.
>
> I wondered if we might be able to come up with some agreement on what
> object-relational impedence mismatch actually means. I always thought
> the mismatch was centred on the issue that a single object != single
> tuple, but it appears there may be more to it than that.
First, I think it is important to clarify that the 'relational' in the mismatch isn't referring to the fact that the OO paradigm uses something other than set theory's relational model. The nature of the impedance mismatch lies in the way the OO and RDB paradigms implement the same relational model.
I think the lack of 1:1 tuple mapping is just a symptom of the mismatch.
There are several contributors to the mismatch...
Applications (not just OO) are designed to solve specific problems so
Object properties include behavior. Behaviors interact in much more complex ways than data. Managing behaviors is the primary cause of failing to map 1:1 between OO Class Diagrams and Data Models of the same subject matter. That's because managing behavior places additional constraints on the way the software is constructed.
OO relationships are instantiated at the object (tuple) level rather than the class (table) level. This allows much better tailoring of optimization to the problem in hand. It also focuses on capturing business rules and policies in the way relationships are instantiated. That, in turn, emphasizes preselecting sets of entities before they are actually accessed. Thus query-like searches for object collaborations are relatively rare in well-formed OO applications.
Corollary: the OO paradigm navigates relationship paths consisting of individual binary associations and sequentially processes object sets resulting from such navigation. Thus there is no direct equivalent of an RDB join in OOPL or AAL syntax. (One can argue that the query/join approach is less tedious, but the OO paradigm has additional goals to satisfy, such as limiting access to knowledge.)
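To make the navigation point concrete, here is a minimal Python sketch. The Customer and Order classes and the total_spent function are hypothetical names invented for illustration; the only point is that the relationship is instantiated per object and then walked, rather than recovered by a class-level search.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    order_id: int
    total: float


@dataclass
class Customer:
    name: str
    # The association is instantiated at the object level: each Customer
    # holds references to exactly the Orders it participates with.
    orders: list = field(default_factory=list)


def total_spent(customer):
    # Navigate the binary association and process the resulting set
    # sequentially. No search over "all Orders" is performed.
    return sum(order.total for order in customer.orders)


alice = Customer("Alice")
alice.orders.append(Order(1, 20.0))
alice.orders.append(Order(2, 5.5))

bob = Customer("Bob")
bob.orders.append(Order(3, 100.0))
```

An RDB would typically answer the same question with a join on a customer key across the whole Order table; here the set was preselected when the relationship was instantiated.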
Object identity is usually not explicitly embedded as an attribute of the object; OO applications are designed around address-based identity in computer memory. This profoundly changes the way one manages referential integrity. Thus OO developers will avoid class-level identity searches whenever possible.
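A small Python sketch of address-based identity, assuming a hypothetical Account class with no key attribute at all. Two objects with equal data are still distinct, and a held reference reaches the one intended object without any lookup.

```python
class Account:
    def __init__(self, balance):
        # Note: no explicit identifier attribute; the object itself is
        # the identity.
        self.balance = balance


a = Account(100)
b = Account(100)   # same data, different object
same = a           # a second reference to the first object

# Identity is decided by the reference, not by comparing attribute values.
a_is_same = a is same
a_is_b = a is b

# Referential integrity amounts to holding a valid reference: updating
# through one reference is visible through the other.
same.balance -= 30
```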
The relations in OO generalizations cannot be instantiated separately; a single tuple resolves the entire generalization. This is the one situation where a Class Model and a Data Model can never map 1:1. The reason lies in the OO paradigm's support of polymorphism.
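As a rough illustration, assume a hypothetical Vehicle/Truck generalization. On the OO side one object resolves both the supertype and the subtype; a common (though not the only) relational mapping splits the same subject matter across two tuples joined on a shared key. The dictionaries below merely stand in for tables.

```python
class Vehicle:
    def __init__(self, vin):
        self.vin = vin

    def describe(self):
        return f"vehicle {self.vin}"


class Truck(Vehicle):
    def __init__(self, vin, payload_kg):
        super().__init__(vin)
        self.payload_kg = payload_kg

    def describe(self):
        # Polymorphic override: the caller only knows it has a Vehicle.
        return f"truck {self.vin} carrying up to {self.payload_kg} kg"


# OO side: a single object carries the whole generalization.
truck = Truck("V1", 900)

# Stand-in for a data model: supertype and subtype stored as separate
# tuples that share a key.
vehicle_table = {"V1": {"kind": "truck"}}
truck_table = {"V1": {"payload_kg": 900}}
```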
> I was hoping perhaps people might be able to offer perspectives on the
> issues that they have encountered. One thing I would like to avoid
> (outside of almost flames of course), is the notion that database
> technology is merely a persistence layer (do people still actually
> think that?) - I wonder if the 'mismatch' stems from such a
> perspective.
The short answer is that any OO application developer sees the DBMS as an implementation of a persistence layer.
I think it is important to distinguish between pure persistence in the form of an RDB and a bundle of specialized server-side applications that are layered on top of an RDB and form a DBMS. Some CRUD/USER processing can be quite complex, such as data mining, but from the end customer's perspective all the server-side applications are providing is data access and formatting.
Similarly, it is important to distinguish between CRUD/USER processing and other problems. In CRUD/USER processing the only problems being solved for the customer are data entry, data selection, and conversion to a convenient display representation. The RAD IDEs and layered model infrastructures already handle that sort of processing quite well (e.g., it is no accident that they employ form-based UIs that conveniently map into RDB tables) and applying OO development there would be largely redundant.
Thus OO developers always believe that a database is a persistence mechanism because they deal with problems outside CRUD/USER processing. IOW, the OO application's solution *starts* with accessing data from a persistent store and *ends* with shipping results off for display rendering. That problem solution doesn't care what kind of data access services the DBMS may provide; it just wants to access and store particular piles of data. Similarly, it doesn't care whether user communications are via GUI, web browser, or heliograph.
To put it more bluntly, from the OO application's solution perspective, the developer couldn't care less that the data was mined from multiple sources using exotic algorithms or whether it is stored in an RDB, an OODB, flat files, or on clay tablets. At the level of abstraction of the OO problem solution, only two services are required: "Save this pile of data I call 'X'" and "Give me the pile of data I call 'X'". Thus the entire interface for accessing persistence from an OO application's problem solution is typically just three messages of the form {message ID, [data packet]} that might look something like:
{SAVE_DATA, data ID, dataset} // to persistence
{GET_DATA, data ID} // to persistence
{HERE_IS_DATA, data ID, dataset} // response from persistence
The application solution will provide its own unique encode/decode of the message data packets into its objects and their attributes that is completely independent of the persistence schemas, etc. Bottom line: the DBMS may provide all sorts of elegant CRUD/USER access services but the OO application doesn't care about that; that belongs to a different trade union.
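The three-message interface might be sketched in Python roughly as follows. Everything here beyond the three message names is an assumption made for illustration: the Message and InMemoryPersistence classes, the Customer class, and the encode/decode helpers are all hypothetical, and the dictionary-backed store merely stands in for whatever persistence mechanism sits behind the interface.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class MessageId(Enum):
    SAVE_DATA = auto()     # to persistence
    GET_DATA = auto()      # to persistence
    HERE_IS_DATA = auto()  # response from persistence


@dataclass
class Message:
    message_id: MessageId
    data_id: str
    dataset: Optional[dict] = None


class InMemoryPersistence:
    """Stand-in for the persistence side; could be an RDB, files, etc."""

    def __init__(self):
        self._store = {}

    def handle(self, message):
        if message.message_id is MessageId.SAVE_DATA:
            self._store[message.data_id] = dict(message.dataset)
            return None
        if message.message_id is MessageId.GET_DATA:
            stored = dict(self._store[message.data_id])
            return Message(MessageId.HERE_IS_DATA, message.data_id, stored)
        raise ValueError(f"unexpected message: {message.message_id}")


# Application side: its own encode/decode, independent of any schema.
@dataclass
class Customer:
    name: str
    credit_limit: float


def encode_customer(customer):
    return {"name": customer.name, "credit_limit": customer.credit_limit}


def decode_customer(dataset):
    return Customer(dataset["name"], dataset["credit_limit"])


persistence = InMemoryPersistence()
persistence.handle(
    Message(MessageId.SAVE_DATA, "X", encode_customer(Customer("Ann", 500.0)))
)
reply = persistence.handle(Message(MessageId.GET_DATA, "X"))
restored = decode_customer(reply.dataset)
```

The application code touches only Message and its own encode/decode functions, so swapping InMemoryPersistence for something else should not disturb the problem solution.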
<aside>
As a practical matter, the client-side does care because somehow those
messages need to be mapped into the server-side DBMS services (e.g.,
creating SQL queries, performance caching, and optimizing joins for the
DBMS). But to do that one only needs to provide the mapping once in a
subsystem that is reusable by any application that accesses that DBMS.
Typically that subsystem would be designed and implemented by someone
who has specialized DBA skills to utilize the DBMS services in an
appropriately clever fashion. IOW, the subsystem represents a
fundamental separation of concerns from the specific problem solution by
isolating and encapsulating specific mechanisms and optimizations
related to persistence access.
Note that when developing large OO applications, one does this sort of
subsystem encapsulation for *all* subsystems within the application; UI
and DB subsystems just happen to be ubiquitous concerns. One does OO
development because one wants maintainable applications. Hence
separation of concerns and encapsulation at the subsystem level is
critically important for decoupling implementations in different parts
of the application.
</aside>
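A minimal sketch of such a mapping subsystem, using Python's standard sqlite3 module. The CustomerPersistence class, the table layout, and the column names are all hypothetical; a real subsystem would carry whatever caching and query tuning the DBA sees fit, but the idea is that the schema and the SQL live here and nowhere else.

```python
import sqlite3


class CustomerPersistence:
    """Hypothetical subsystem: the only place that knows the schema."""

    def __init__(self, connection):
        self._db = connection
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS customer ("
            "id TEXT PRIMARY KEY, name TEXT NOT NULL, "
            "credit_limit REAL NOT NULL)"
        )

    def save_data(self, data_id, dataset):
        # Map "save this pile of data I call X" onto the schema.
        self._db.execute(
            "INSERT OR REPLACE INTO customer (id, name, credit_limit) "
            "VALUES (?, ?, ?)",
            (data_id, dataset["name"], dataset["credit_limit"]),
        )
        self._db.commit()

    def get_data(self, data_id):
        # Map "give me the pile of data I call X" onto a query.
        row = self._db.execute(
            "SELECT name, credit_limit FROM customer WHERE id = ?",
            (data_id,),
        ).fetchone()
        if row is None:
            raise KeyError(data_id)
        return {"name": row[0], "credit_limit": row[1]}


persistence = CustomerPersistence(sqlite3.connect(":memory:"))
persistence.save_data("X", {"name": "Ann", "credit_limit": 500.0})
fetched = persistence.get_data("X")
```

Changing the table layout or replacing SQLite with another store would then be confined to this one class.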
--
There is nothing wrong with me that could not be cured by a capful of Drano.

H. S. Lahman
hsl_at_pathfindermda.com
Pathfinder Solutions
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
"Model-Based Translation: The Next Step in Agile Development".
Email info_at_pathfindermda.com for your copy.
Pathfinder is hiring: http://www.pathfindermda.com/about_us/careers_pos3.php.
(888)OOA-PATH

Received on Tue Mar 04 2008 - 20:14:48 CET