Messy SQL performance question (NOT in SELECT clause)

From: Roger Loeb <rloeb_at_martech.com>
Date: Wed, 15 Apr 1998 11:13:00 -0600
Message-ID: <6h2pqs$85m$1@news1.rmi.net>

I have an application that does queries on a 400 million (yes) table. There are two queries run sequentially. The program is looking for "candidates" within clusters of rows. The table is clustered on the primary key, which is also the key used in the first of the two queries. A typical first query may return as many as 100,000 rows. The second query is only used when the first query does not find a suitable candidate. It uses a separate index to select rows that were NOT selected in the first query.

The form of the first query is "SELECT /*+ INDEX primary-index */ column-names from table-name where PRIMARY-KEY = :value"

The second query is of the form "SELECT /*+ INDEX secondary-index */ column-names from table-name where SECONDARY-KEY = : value2 AND NOT PRIMARY-KEY = :value"

The purpose of the NOT is to avoid retrieving all of the rows that were already read in the first query.

The problem: most Oracle tuning books indicate that any use of a NOT condition will result in a full table scan. That's rather a problem when there are 400 million rows. I'm just trying to avoid the overhead of selecting all those rows and testing them internally for duplication with the first query. Seems to me that, at least theoretically, a join of the two indices identifies those records to NOT select, but I do appreciate the concept of a negative join.

Are the tuning manuals largely correct, i.e., does this form result in a full table scan? Can anyone suggest an alternate way to reduce the volume from the second query?

Thanks,

Rog

--
roger@_delete_this_to_reply_.martech.com Received on Wed Apr 15 1998 - 12:13:00 CDT