create index idx_tab_4 on table_4(a)
运行以下查询
select /*b*/ count(b)
from table_1 t1
where a in (select a from table_4)
call count cpu elapsed disk query current rows
------- ------ ------------------ ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 2 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.20 0.20 0 1811 0 1
------- ------ ------------------ ---------- ---------- ---------- ----------
total 3 0.20 0.20 0 1813 0 1
Misses in library cacheduring parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 92
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=1811 pr=0 pw=0 time=0 us)
2000 HASH JOIN RIGHT SEMI (cr=1811 pr=0 pw=0time=41 us cost=509 size=60000 card=2000)
2000 INDEX FAST FULL SCAN IDX_TAB_4 (cr=10 pr=0pw=0 time=31 us cost=4 size=26000 card=2000)(object id 80315)
500000 TABLE ACCESS FULL TABLE_1 (cr=1801 pr=0pw=0 time=7555 us cost=502 size=8500000 card=500000)
在这样的数据特征下,即使事先合并了副数据集成一张表,再进行半连接。测试结果表明,这样做并没有没有多大改进。
主数据集的记录数远大于副数据集的半关连
create table table_5
as select
cast(rownum*2+1 as int) a,
cast(rownum*2+10 as int) b,
cast(dbms_random.string('i',10) as varchar2(10)) c
from dual connect by level<=10
explain plan for
select count(b)
from table_1
where exists(select null from table_3 where a=table_1.a);
--------------------------------------------------------------------------------------------
| Id |Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECTSTATEMENT | | 1 | 30 | 19 (6)| 00:00:01|
| 1 | SORT AGGREGATE | | 1 | 30 | | |
| 2| NESTED LOOPS | | | | | |
| 3 | NESTED LOOPS | | 10 | 300 | 19 (6)| 00:00:01 |
| 4 | SORT UNIQUE | | 10 | 130 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | TABLE_3 | 10 | 130 | 3 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | IDX_TAB1_A | 1 | | 2 (0)| 00:00:01|
| 7 | TABLE ACCESS BY INDEX ROWID| TABLE_1 | 1 | 17 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
结果看到,这时候执行计划中,没有出现join semi半关连,在这种特殊情况下,转化为nested loop。在这里用上了table_1的a列上的索引了。
副数据集来自于集合的半关连
explain plan for
with temp_table_2 as
(select /*+cardinality(10)*/ to_number(column_value) a fromtable(str_to_table('1,2,3,4,5,6',',')))
select c
from table_1
where exists (select null fromtemp_table_2 where a= table_1.a) /*mmm*/
---------------------------------------------------------------------------------------------------
| Id |Operation |Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------