Re: Yes, this is probably a stupid question but...

From: Pat <pat.casey_at_service-now.com>
Date: Sat, 12 Apr 2008 18:58:22 -0700 (PDT)
Message-ID: <6b36394a-3e8e-4b25-81ba-1b3eddddd1a9@k13g2000hse.googlegroups.com>


On Apr 12, 5:51 pm, hpuxrac <johnbhur..._at_sbcglobal.net> wrote:
> On Apr 12, 8:01 pm, Pat <pat.ca..._at_service-now.com> wrote:
>
>
>
> > On Apr 12, 4:44 pm, "Ana C. Dent" <anaced..._at_hotmail.com> wrote:
>
> > > Pat <pat.ca..._at_service-now.com> wrote in news:0653963d-1d32-4e85-9bae-
> > > 55e038b9e..._at_f63g2000hsf.googlegroups.com:
>
> > > > Question I have is:
>
> > > > Short of constructing a summary table (which is probably a non starter
> > > > in this environment), is there anything I could to to speed up the
> > > > return time of a count query like the above?
>
> > > Ready, Fire, Aim!
>
> > > If you don't know specifically where the time is being spent,
> > > you'd only be guessing what change is most optimal.
>
> > > Run your query with SQL_TRACE=TRUE & run the results thru tkprof.
>
> > > Post the results & EXPLAIN PLAN back here is you still need more asisstance
>
> > I'm working over a remote desktop at the moment so I can't cut and
> > paste the plan so I'll transcribe:
>
> > select count(*) as recordcount from (a inner join b on a.sys_id =
> > b.sys_id) where a.active = 1
>
> > Table has aprox 180k rows, of which 151k have active = true
>
> > SELECT STATEMENT
> > _SORT AGGREGATE
> > __NESTED LOOPS
> > ___INDEX RANGE SCAN (index on a.active, a.sys_id)
> > ___INDEX UNIQUE SCAN (index on b.sys_id)
>
> > Total Query time: 2.39 seconds
>
> > I've got a covering index on the primary table and an index on the
> > child table's join condition, so there's no actual table reads going
> > on here e.g. it's just an index join. I have current stats on the
> > table (as of last night). From what I can see this is not an absurdly
> > bad optimizer plan, but then I suspect there's something I'm not
> > seeing, hence my post here.- Hide quoted text -
>
> > - Show quoted text -
>
> The nested loops technique works pretty well with a limited volume of
> matching records but if you are really getting counts of 200k at times
> well it's a non-performer from the get go.
>
> A hash join might perfom a ( little ) better ... but still may not be
> adequate.
>
> What release level and patchset are you running on exactly? What kind
> of hardware platform?
>
> Does the application really require an absolutely accurate count in
> these circumstances and if so how is this accurate count going to be
> used?
>
> Re-design may be called for here or at least some bypass type
> techniques.
>
> If absolutely correct counts are not required one way of thinking
> about a bypass is limiting the count to some base number of rows via
> rownum ( select count(*) from whatever the tables are with join
> condition AND rownum <= 10,100,1000 ).
>
> Another bypass technique might be to create a table that has counts
> already calculated and refresh that table on a periodic basis. Then
> point the query against the pre-aggregated table.

Unfortunately (from a database efficiency standpoint), exact counts are required, or at least expected by our current user base. Summary tables are probably not going to work either since I can't predict what the count is going to be over. In this case, this user is hammering the subset active=1, but tomorrow it may be based on some other specifier.

Switching this to a hash join sounds like it has promise (or at least be worth a shot).

Is there an optimizer hint I can add to force a hash join in this case? Be worth a shot.

I have to admit I am a bit surprised this is all that slow though. It's a modern box with 3.0 ghz CPUs in it (intel) and virtually no load apart from this query. He can basically monopolize an entire core (and he does according to vmstat), and it's still taking him about 2.2 seconds to finish. Seems like that's an aweful lot of CPU time just to join a couple of hundred thousand index nodes together and count the result. Virtually no physical IOs involved here either. Far as I can tell, this is pure CPU load.

10.2.0.4 64 bit on RHEL 5.0. Received on Sat Apr 12 2008 - 20:58:22 CDT

Original text of this message