Re: Functions and Relations

From: Aloha Kakuikanu <aloha.kakuikanu_at_yahoo.com>
Date: 21 Nov 2006 13:35:24 -0800
Message-ID: <1164144924.033155.307890_at_b28g2000cwb.googlegroups.com>


NENASHI, Tegiri wrote:
> "Aloha Kakuikanu" <aloha.kakuikanu_at_yahoo.com> wrote in
> news:1164136641.739367.248440_at_j44g2000cwa.googlegroups.com:
>
> > Aloha Kakuikanu wrote:
> >> Now returning to the main discussion. The query in question:
> >>
> >> select x,y from T
> >> where x=y and y=f(x)
> >>
> >> ....
> >
> > More realistic example:
> >
> > create table T (
> > x integer,
> > y integer,
> > );
> >
> > create index Ti on T(x,y);
> >
> > insert into t
> > select i/1000, mod(i,1000)
> > from Integers where i < 100000
> >
> > select * from T
> > where x = 1 and y = x+1
> >
> > Is your SQL database of choice able to find the access path where `x=1`
> > is evaluated first, then it is joined with `y = x+1` and finally a
> > tuple from T is fetched by index range scan?
>
> MS SQL Server is able to find the access path where x=1 is evaluated
> first. It seeks to the index location and then utilizes the function
> condition and reads the values from the index. I changed a little your
> SQL to read one row but it is the same exection plan:
>
> select * from T
> where x = 1 and y = x-1 -- '+' -> '-'
>
>
> Plan:
>
> |--Index Seek(OBJECT:([test].[dbo].[T].[Ti]),
> SEEK:([test].[dbo].[T].[x]=CONVERT_IMPLICIT(int,[_at_1],0)),
> WHERE:([test].[dbo].[T].[y]=([test].[dbo].[T].[x]-[_at_2])) ORDERED FORWARD)
>
> <RelOp AvgRowSize="15" EstimateCPU="0.0001581" EstimateIO="0.003125"
> EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Index
> Seek" NodeId="0" Parallel="false" PhysicalOp="Index Seek"
> EstimatedTotalSubtreeCost="0.0032831">
>
> <RunTimeInformation>
> <RunTimeCountersPerThread Thread="0" ActualRows="1"
> ActualEndOfScans="1" ActualExecutions="1" />
> </RunTimeInformation>

In other words, it doesn't rewrite the query to

select * from T
where x = 1 and y = 0

? BTW, I posted this challenge to the "big 3" forums, and Lennart posted qute a surprising fact (at least for me) that DB2 rewrites the query even when we have UDF:

create function my_f2 (x int)
 returns int
 language sql
 contains sql
 deterministic
 return case when mod(x,2)=0 then x*x+3 else -1*x end;

select *
from T
where x = 1 and y = my_f2(x)

(Of course, the solution is obvious from the "predicates are relations, and functions are predicates" perspective. Moreover, we can leverage the existing machinery of brute join orders evaluation without putting any burden on optimizer's query rewrite capabilities). Received on Tue Nov 21 2006 - 22:35:24 CET

Original text of this message