Home » RDBMS Server » Performance Tuning » query problem
query problem [message #194626] Sat, 23 September 2006 01:36 Go to next message
mjschwenger
Messages: 137
Registered: July 2005
Location: US
Senior Member
I'm trying to prevent the FTS on a big table, but it seems not to get any better.
In the original production query1(below), I do not see any reason for the UNION, so I removed it, but I still see FTS on sep_relation as I'm selecting * and obviously the enity_id is not enough limiting condition. Query2 is simplified from query1. In Query3 and 4 I try to fix it, but it doesn't get any better...
If somebody could give an idea how to work around the problem, I would greatly apprecited.
All I want is to select * from the table where relto_id = 10 or Entity_id=10 and the type from the third table is not 'Search'.
Thanks a lot, mj

Query1:
SELECT s.*
FROM sep_relations s, entity e, entity e2, entity_type et, entity_type et2
WHERE s.sys_delete_dt IS NULL
and s.relto_id = e.entity_id
and e.entity_type_id = et.entity_type_id
and et.entity_type <> 'SEARCH'
and s.entity_id = e2.entity_id
and e2.entity_type_id = et2.entity_type_id
and et2.entity_type <> 'SEARCH'
union
SELECT s.*
FROM sep_relations s, entity e, entity e2, entity_type et, entity_type et2
WHERE s.sys_delete_dt IS NULL
and s.entity_id = e.entity_id
and e.entity_type_id = et.entity_type_id
and et.entity_type <> 'SEARCH'
and s.relto_id = e2.entity_id
and e2.entity_type_id = et2.entity_type_id
and et2.entity_type <> 'SEARCH'

Q2:
SELECT s.*
FROM sep_relations s, entity e, entity e2, entity_type et, entity_type et2
WHERE s.sys_delete_dt IS NULL
and s.relto_id = e.entity_id
and e.entity_type_id = et.entity_type_id
and et.entity_type <> 'SEARCH'
and s.entity_id = e2.entity_id
and e2.entity_type_id = et2.entity_type_id
and et2.entity_type <> 'SEARCH'


Q3:
SELECT * from
sep_relations s where s.sys_delete_dt IS NULL and relto_id in
(select e.entity_id from entity e, entity_type et
where e.entity_type_id = et.entity_type_id
and et.entity_type <> 'SEARCH')
and entity_id in
(select e.entity_id from entity e, entity_type et
where e.entity_type_id = et.entity_type_id
and et.entity_type <> 'SEARCH')


Q4:
SELECT * from
(SELECT relation_id, e.entity_id, relto_id,
like_conf,rel_conf, disclosed, match_id, s.sys_delete_dt
FROM sep_relations s, entity e
where s.sys_delete_dt IS NULL
and s.relto_id = e.entity_id)a
inner join
(SELECT relation_id, e1.entity_id, relto_id,
like_conf,rel_conf, disclosed, match_id, s1.sys_delete_dt
FROM sep_relations s1, entity e1
where s1.sys_delete_dt IS NULL
and s1.entity_id = e1.entity_id)b
on a.entity_id=b.entity_id
where a.sys_delete_dt is null
and b.sys_delete_dt is null

Re: query problem [message #194662 is a reply to message #194626] Sat, 23 September 2006 12:50 Go to previous messageGo to next message
wagnerch
Messages: 58
Registered: July 2006
Member
It's a bit difficult to say without knowing the explain plans, data distribution, row counts, what has indexes and what doesn't, etc.

But using "s.sys_delete_dt IS NULL" isn't going to help your access path, it will only be used as a filter predicate. How many rows are in sep_relations? How many rows have sys_delete_dt set to NULL?

If you could run these queries, it would help understand the data distribution a bit more:

SELECT COUNT(*), COUNT(DISTINCT sys_delete_dt), COUNT (sys_delete_dt), COUNT(DISTINCT relto_id), COUNT(DISTINCT entity_id)
FROM sep_relations;

SELECT COUNT(*), COUNT(DISTINCT entity_id), COUNT(DISTINCT entity_type_id)
FROM entity;

SELECT COUNT(*), COUNT(DISTINCT entity_type_id), COUNT(DISTINCT entity_type)
FROM entity_type;


Also, do you have current statistics on all three tables (hint: DBMS_STATS.gather_table_stats) ?

If there is *few* rows that have sys_delete_dt set to NULL, then you could use a function based index with a histogram to map null and non-null values and provide a access path through sys_delete_dt.

Something along the lines of...

create index fidx_sep_relations on sep_relations (NVL2(sys_delete_dt, 'Y', 'N'));

begin
  dbms_stats.gather_table_stats(
    ownname=>USER,
    tabname=>'SEP_RELATIONS',
    estimate_percent=>null,
    method_opt=>'for all columns size auto');
end;
/

SELECT s.*
FROM sep_relations s, entity e, entity e2, entity_type et, entity_type et2
WHERE NVL2(s.sys_delete_dt,'Y','N') = 'N'
and s.relto_id = e.entity_id 
and e.entity_type_id = et.entity_type_id
and et.entity_type <> 'SEARCH'
and s.entity_id = e2.entity_id 
and e2.entity_type_id = et2.entity_type_id
and et2.entity_type <> 'SEARCH';


Of course it is hard to say it will improve anything, without knowing more.
Re: query problem [message #194675 is a reply to message #194626] Sat, 23 September 2006 22:58 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
Remove entity e, entity e2, entity_type et, entity_type et2 out of the FROM clause because they contribute NO fields to the SELECT clause.
Since they are required for the correct solution set, subordinate them in the WHERE clause.
If you do so, a speed up 10 fold may results
Re: query problem [message #194678 is a reply to message #194662] Sat, 23 September 2006 23:16 Go to previous messageGo to next message
mjschwenger
Messages: 137
Registered: July 2005
Location: US
Senior Member
Of course, all the tables and indexes have their statistic updated. This is the data distribution:
On Test DB - at about 8.5 mln records in Sep_relation, 35% of them have Sys_delete_dt not null. The distinct Relto_id is over 8.2 mln and entity_id is about 8.3 mln. Entity table has close to 10 mln of records. There are 4 records in entity_type table which results in 2 distinct types in the entity_type into entity table.
Even if I remove completely Sys_delete_dt I still have a FTS on sep_relation although the biggest expected result set will never have more then 15 rows. It looks that the 2 Ids (which are pretty unique) are not enough limiting factors...
Please, explain the second post:
"If you do so, a speed up 10 fold may results "
I did try to put everything in a simple where and did not make any difference.
Thanks a lot, mj
Re: query problem [message #194699 is a reply to message #194626] Sun, 24 September 2006 08:34 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
>WHERE s.sys_delete_dt IS NULL
Forces a FTS & precludes use of index, because index never contains NULL value(s).
Re: query problem [message #194703 is a reply to message #194699] Sun, 24 September 2006 09:42 Go to previous messageGo to next message
mjschwenger
Messages: 137
Registered: July 2005
Location: US
Senior Member
Yes, I already did that. Unfortunately, Sys_delete_dt is almost always Null - (up to 35% in test, I believe even less on Production is valued) and the FTS is still there. Here is the query and the plan still have FTS on sep_relation:
SELECT s.*
FROM sep_relations s, entity e, entity e2, entity_type et, entity_type et2
WHERE s.sys_delete_dt is null
and s.relto_id = e.entity_id
and e.entity_type_id = et.entity_type_id
and et.entity_type <> 'SEARCH'
and s.entity_id = e2.entity_id
and e2.entity_type_id = et2.entity_type_id
and et2.entity_type <> 'SEARCH';

Thanks a lot, mj
Re: query problem [message #194714 is a reply to message #194703] Sun, 24 September 2006 20:00 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
mjschwenger wrote on Sat, 23 September 2006 16:36

All I want is to select * from the table where relto_id = 10 or Entity_id=10 and the type from the third table is not 'Search'.


I don't see any predicate that limits the query to relto_id = 10. If it were there, and relto_id was indexed, then I expect it would use the index.

Ross Leishman
Re: query problem [message #194715 is a reply to message #194678] Sun, 24 September 2006 20:35 Go to previous messageGo to next message
wagnerch
Messages: 58
Registered: July 2006
Member
Quote:

Of course, all the tables and indexes have their statistic updated.


Well you never said that, plus you didn't post the indexes, explain plans or really anything more than the queries. If you posted this information then it would be much more valuable.

Quote:

Even if I remove completely Sys_delete_dt I still have a FTS on sep_relation although the biggest expected result set will never have more then 15 rows.


This is to be expected.


Quote:

It looks that the 2 Ids (which are pretty unique) are not enough limiting factors...



Why do you think the 2 id's would be a good limiting factor? Since you basically stated the distinct values for entity_type_id's is 4, the optimizer believes that it is going to select 2.5 million records (25%) of the entity table. Then it will join the 2.5 million records into a 8.5 million row table where it sounds like it is about a 1:1 (sep_relation to entity). So why would you even consider the other tables to be good limiting criteria?


You stated that production has even less null values for sys_delete_dt, why don't you give us the information against production and provide the data distributions (based on the queries above)? Perhaps the histogram and function based index will help you if there is very few records with sys_delete_dt. Other options may be bitmap indexes, but it really depends what your doing. Bitmap indexes will cripple you if it is a table with a high amount of DML.
Re: query problem [message #194716 is a reply to message #194715] Sun, 24 September 2006 23:07 Go to previous messageGo to next message
mjschwenger
Messages: 137
Registered: July 2005
Location: US
Senior Member
I cannot use the bitmap index as there are 4-5 million records per day that are inserted into this table and about as half are updated.

The incredible view (query1 above) was put in production only because the people could not make this query to run OK. Then they have decided to put this query as view, and to query it as select * from view where relto_id = 100 and entity_id=100 and sys_delete-dt not null.
I'll provide all indexes tomorrow when I get to work and plans as well. But this is a principle question which I do not think I could resolve in the way it is set right now... too bad...
At least I was able to remove the unnecessary UNION from the first query example.
Thanks a lot, mj
Re: query problem [message #194768 is a reply to message #194716] Mon, 25 September 2006 03:58 Go to previous messageGo to next message
wagnerch
Messages: 58
Registered: July 2006
Member
Quote:

I cannot use the bitmap index as there are 4-5 million records per day that are inserted into this table and about as half are updated.


Makes sense.


Quote:

The incredible view (query1 above) was put in production only because the people could not make this query to run OK. Then they have decided to put this query as view, and to query it as select * from view where relto_id = 100 and entity_id=100 and sys_delete-dt not null.


That doesn't make sense. Why would the view be a poor performer if they are using fairly unique keys to reference it? Also, if the view is doing sys_delete_dt IS NULL and the users are doing sys_delete_dt IS NOT NULL isn't the result 0 rows? If there is less than say 5% NULL values for sys_delete_dt then you could (again) use a function based index to map the NULL values into N's and the non-NULL values into Y's and via a histogram you could quickly refernece NULL-value rows -- but it seems pointless if the users are referencing values based on fairly unique keys.

Quote:


I'll provide all indexes tomorrow when I get to work and plans as well.


Please be sure to post explain plan's that represent how the users would use the view, not just "select * from view1".
Re: query problem [message #194877 is a reply to message #194768] Mon, 25 September 2006 19:05 Go to previous message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Forget trying to index sys_delete-dt not null. It won't help.

You have three options (in order of descending performance):
  1. Create an index with ENTITY_ID and RELTO_ID as the leading columns. This index should be used (provided the two predicates are ANDed, not ORed - your posts below are contradictory).
  2. Have a separate index on each of ENTITY_ID and RELTO_ID. In v9i+ (I think, maybe 10g) Oracle is clever enough to perform bitmap conversions, using b-Tree indexes a lot like Bitmap indexes to resolve ORs and ANDs from multiple indexes. Even in prior versions, Oracle should be able to scan both indexes: you could use AND_EQUAL for ANDed predicates, or a USE_CONCAT hint for OR predicates.
  3. Have an index on either ENTITY_ID and RELTO_ID - Oracle will scan the indexed column and then filter the other one (once again, provided the two predicates are ANDed). Try an INDEX hint if Oracle refuses to use it - if its slower, you'll know why.


If the predicates are ORed and the plan is not showing a UNION, try the USE_CONCAT hint.

Ross Leishman
Previous Topic: Inefficient sql
Next Topic: ABOUT STATSPACK
Goto Forum:
  


Current Time: Wed Nov 27 05:54:44 CST 2024