Re: indexing null values curious case?
Date: Thu, 21 Nov 2019 12:38:52 +0000
Message-ID: <CACj1VR6mMGELY3x4CUP4pvUDRNZYXeHcD6s_vva-B2Ug_3YxqQ_at_mail.gmail.com>
> Another part I believe is that Oracle's statistics (including histogram
doesn't maintain any statistics regarding nulls, which makes sense in a
way), as these are only explicit cases and the concept of statistics
optimizer everything resolves only around the values that are present but
not null..
It’s quite the opposite, Oracle gives you a free frequency histogram for null values with the num_nulls column. It will use these statistics when relevant.
Thanks,
Andy
On Thu, 21 Nov 2019 at 12:18, Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk> wrote:
> Vishnu,
>
> Your STUDENTS example looks like a bug. (The stats for the NULL and ''
> case are identical, by the way, so it's likely to be related to the way
> that Oracle stores the '' as the DEFAULT for the column definition).
>
> re:
> select * from temp where mark1 is null; --> it always goes with a full
> table scan
> It's got to; Oracle know that there are some rows with mark1 null, some
> with mark2 null, so it have to asume that the two sets of rows may overlap
> to give a row which does not appear in thei index.
>
> select * from temp where mark1 is null and mark2 = 123123;
> Now you're looking for a row which (if it exists) will be in the index.
> You could equally change the mark2 = 123123 to mark2 is not null.
>
> re:
> all i did was add an extra column to the index (virtual column).
> So now you have a column in the index which is NEVER null, so every row in
> the table will appear in the index, so the index can be used for mark1 is
> null
>
>
> re
> mark2 is null
> The possibly paths are
> index skip scan
> index full scan
> index fast full scan
> tablescan
>
> The choice of tablescan is almost certainly dictated by cost.
>
>
> BOTTOM LINE - your students case looks like a bug relating to the special
> case of the way Oracle handles the DEFAULT for a constant empty string. The
> rest are correct behaviour.
>
>
> Regards
> Jonathan Lewis
>
>
>
> ________________________________________
> From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> on
> behalf of Vishnu Potukanuma <vishnupotukanuma_at_gmail.com>
> Sent: 21 November 2019 11:07
> To: Andy Sayer
> Cc: oracle-l_at_freelists.org
> Subject: Re: indexing null values curious case?
>
> Apparently looks like this entire section dealing with the null is not
> developed in its entirety.. the following is a different case as well
> regarding the nulls.
>
> even with a multi-column index, considering another scenario as the
> following:
> create table temp (roll number, name varchar2(20), mark1 number, mark2
> number);
> insert into temp select rownum, dbms_random.string(0,20),
> round(dbms_random.value(0,100), round(dbms_random.value(0,200)) from dual
> connect by level < 1000000;
> insert into temp (roll, name, mark2) select rownum,
> dbms_random.string(0,20), round(dbms_random.value(0,200) from dual connect
> by level < 10;
> insert into temp (roll, name, mark1) select rownum,
> dbms_random.string(0,20), round(dbms_random.value(0,200) from dual connect
> by level < 10;
> commit;
> create index idx on temp(mark1, mark2);
> Here i have only 10 rows from each of the columns as null;
>
> exec dbms_Stats.gather_table_stats('VISHNU','TEMP',CASCADE=>TRUE);
>
> from the leaf blocks we can find that the nulls are stored in the index.
> The leaf blocks containing the NULLs is as follows:
> row#0[8019] flag: -------, lock: 0, len=13
> col 0; len 2; (2): c1 02
> col 1; NULL
> col 2; len 6; (6): 01 c0 21 de 00 00
> row#1[7992] flag: -------, lock: 2, len=14
> col 0; len 3; (3): c2 02 04
> col 1; NULL
> col 2; len 6; (6): 01 c0 21 de 00 02
> row#2[8006] flag: -------, lock: 0, len=13
> col 0; NULL
> col 1; len 2; (2): c1 02
> col 2; len 6; (6): 01 c0 21 de 00 01
> row#3[7978] flag: -------, lock: 2, len=14
> col 0; NULL
> col 1; len 3; (3): c2 0d 20
> col 2; len 6; (6): 01 c0 21 de 00 03
>
> if we issue a query like select * from temp where mark1 is null; --> it
> always goes with a full table scan even when 10 rows only have null values
> and apparently the index leaf blocks do contain entries regarding the null
> values), for the leading as well as the tail columns as shown above. But
> the wierd thing is if we include both the columns mark1, mark2 in the
> predicate clause as following:
> select * from temp where mark1 is null and mark2 = 123123;
> the optimizer selects the index.
>
>
> now comes even better part.
> drop the index and create the index as follows:
> create index idx on temp(mark1,mark2,1);
> all i did was add an extra column to the index (virtual column).
> collected the statistics and ran the query back again.
> select * from temp where mark1 is null;
> this time it selects the index properly which makes me wonder if they have
> included or written the code or made the optimizer to consider the nulls in
> the predicate clause only in the presence of virtual columns.
>
> Another part I believe is that Oracle's statistics (including histogram
> doesn't maintain any statistics regarding nulls, which makes sense in a
> way), as these are only explicit cases and the concept of statistics
> optimizer everything resolves only around the values that are present but
> not null..
>
> also an another distinct behavior observed is this regarding of whether
> the index is
> create index idx on temp(mark1,mark2)
> or create index idx on temp(mark1,mark2,1);
>
> any query with predicates involving only mark2 regarding nulls goes with
> full table scan forget skip scans etc etc.
> queries like
> select * from temp where mark1 is not null and mark2 is null;
> select * from temp where mark2 is null;
> apparently some explanation for this can be the way oracle doesn't include
> nulls in the histograms.. and I am tired for the day!
>
> I can't make any assumptions as of which of the optimizer code has to be
> developed just with these observations, but the way oracle handles nulls
> should be improved consistently.
>
> can someone please tell me if i am missing any thing here? Oracle is a
> like a ocean. so many concepts no wonder i became bald at a very young age.
>
> Thanks,
> Vishnu
>
> On Thu, Nov 21, 2019 at 2:05 PM Vishnu Potukanuma <
> vishnupotukanuma_at_gmail.com<mailto:vishnupotukanuma_at_gmail.com>> wrote:
> This is reproducible case and probably hitting an undeveloped section of
> oracle code or a bug. Sure, let me see what I can do.
>
> Thanks,
> Vishnu
>
> On Thu, Nov 21, 2019 at 1:55 PM Andy Sayer <andysayer_at_gmail.com<mailto:
> andysayer_at_gmail.com>> wrote:
> Hi Vishnu
>
> File your reproducible test case with Oracle support if you suspect you’re
> hitting a bug.
>
> It is not expected for the existence of a special index to change the
> result of a query.
>
> If you want to find null values and the selectivity is decent for an index
> then I would use an index on (column_name,0).
>
> Thanks,
> Andy
>
> On Thu, 21 Nov 2019 at 08:18, Vishnu Potukanuma <
> vishnupotukanuma_at_gmail.com<mailto:vishnupotukanuma_at_gmail.com>> wrote:
> Hi,
>
> If all the columns in the index are null then the entries are not stored
> in the index. This is a known fact and empty strings are stored as null
> values. combining both breaks the consistency part where the query returns
> wrong results.
>
> this is really a worst case as no one creates such a index as
> (column_name,null) or (column_name,'') as this really doesn't make sense
> but this breaks the consistency aspect of database as the query gives
> wrong result as the optimizer consider index even when the leading column
> is null.
>
> the scenario is basically as following:
> Create table students (student_id number, name varchar2(20);
> insert into students select rownum, dbms_random.string(0,20) form dual
> connect by level < 1000000;
> insert into students (name) select dbms_random.string(0,20) from dual
> connect by level < 100;
> commit;
> create index idx on student(student_id,null);
> exec dbms_stats.gather_Table_stats('VISHNU','students',cascade=>True);
> select * from students where student_id is null;
> in this case it goes with the full table scan which is correct since both
> nulls are not stored.
>
> so now we drop the index idx;
> Here we create the index as this.
> create index idx on student(student_id,'');
> exec dbms_stats.gather_Table_stats('VISHNU','students',cascade=>True);
> select * from students where student_id is null;
> now Oracle retrives the results as 0, with the index scan.
>
> the leaf blocks store the details as following since the empty strings are
> treated as nulls
> row#0[8019] flag: -------, lock: 0, len=13
> col 0; len 2; (2): c1 02
> col 1; NULL
> col 2; len 6; (6): 01 c0 21 dc 00 00
> row#1[8006] flag: -------, lock: 2, len=13
> col 0; len 2; (2): c1 03
> col 1; NULL
> col 2; len 6; (6): 01 c0 21 dc 00 02
> but things as expected, index doesn't store entries when the leading
> column is null.
>
> here the oracle goes with the index scan and returns zero results. which
> is inconsistent result.
> optimizer trace considers that index with (student_id, '');
>
> Access Path: TableScan
> Cost: 21017.655599 Resp: 21017.655599 Degree: 0
> Cost_io: 20995.000000 Cost_cpu: 807173669
> Resp_io: 20995.000000 Resp_cpu: 807173669
> Access Path: index (index (FFS))
> Index: IDX
> resc_io: 1039.000000 resc_cpu: 237414695
> ix_sel: 0.000000 ix_sel_with_filters: 1.000000
> Access Path: index (FFS)
> Cost: 1045.663711 Resp: 1045.663711 Degree: 1
> Cost_io: 1039.000000 Cost_cpu: 237414695
> Resp_io: 1039.000000 Resp_cpu: 237414695
> ****** Costing Index IDX
> SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_SCAN
> SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_FILTER
> Access Path: index (IndexOnly)
> Index: IDX
> resc_io: 5.000000 resc_cpu: 235407
> ix_sel: 6.6556e-04 ix_sel_with_filters: 6.6556e-04
> Cost: 5.006607 Resp: 5.006607 Degree: 1
> Best:: AccessPath: IndexRange
> Index: IDX
> Cost: 5.006607 Degree: 1 Resp: 5.006607 Card: 999.000000
> Bytes: 0.000000
>
> probably we are hitting an undeveloped section of optimizer? producing a
> wrong plan is ok, but producing wrong results?
>
> Thanks,
> vishnu
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>
-- http://www.freelists.org/webpage/oracle-lReceived on Thu Nov 21 2019 - 13:38:52 CET