#@#@#@!!!!!!

As technical consultant, I learn new stuff everyday. My goal here to collect those day-to-day findings and organized them for future reference.

#@#@#@!!!!!!

Saturday, July 01, 2006

What causing parser to choose table scan vs index scan

I'm trying to tune a query, and I am surprised by a relatively simple thing. I have a large (~1M rows) table ec_bug which has an index on ( id, snapshot_date ). You would think that it would be faster to get at the IDs by doing an INDEX (FAST FULL SCAN) than by looking at the table itself. In fact, that's what happens for this query:

> select _count(unique id)_ from j2se.ec_bug;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=593 Card=1 Bytes=5)
1 0 SORT (GROUP BY)
2 1 _INDEX (FAST FULL SCAN)_ OF 'EC_ID_IX' (INDEX)
(Cost=593 Card=875305 Bytes=4376525)


But if I want the actual IDs instead of just the count, the optimizer switches to a full scan of the table itself, at a much

greater cost:

> select _unique id_ from j2se.ec_bug;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2262 Card=13054
Bytes=65270)
1 0 SORT (UNIQUE) (Cost=2262 Card=13054 Bytes=65270)
2 1 _TABLE ACCESS (FULL)_ OF 'EC_BUG_TBL' (TABLE)
(Cost=2128 Card=875305 Bytes=4376525)


This is especially surprising since the index is already sorted by ID, so the sort would incur no cost.

Multiple failures to use this index in a more complex query are causing it to time out; does anyone know what might be

causing the optimizer to ignore this index if I want the actual ID value?

Followup 1:

I believe the issue is NULLs. If you allow NULLs in the "id" column, you will
have to sample the whole table to find them. Any function, such as "count"
will ignore rows with NULLs. But select unique id will include the NULL.

For example:

SQL> create table x (id number, dt date);

Table created.

SQL> insert into x values (NULL,sysdate);

1 row created.

SQL> insert into x values (2, sysdate);

1 row created.

SQL> insert into x values (3, sysdate);

1 row created.

SQL> insert into x values (4, sysdate);

1 row created.

SQL> insert into x values (5, sysdate);

1 row created.

SQL> insert into x values (6, sysdate);

1 row created.

SQL> select count (unique id) from x;

COUNT(UNIQUEID)
---------------
5

SQL> select unique id from x;

ID
----------
2
3
4
5
6


6 rows selected.

So the count can use the index and ignore any NULL rows (which are not in
the index, I believe). But the direct select has to include the NULL so it
has to go against the table itself.


Followup 2:

Those fields were part of several indexes, but not the primary key, and weren't otherwise constrained to be null. Changing them to NOT NULL caused the index to be used.

Referance : oracle interest alias

KB : ORACLE_Db\TUNING\Table_scan_vs_index_scan_example1.txt

0 Comments:

Post a Comment

<< Home