oracle中not exists对外层查询的影响详解
前言
最近同事发现了一个问题,在12c中跑的bufferget很高,但是在10g中跑的buffer很低。怀疑是不是12c的优化器有问题。
这个10g的环境和12c的环境,数据量大致一样,只是有很少部分的不同,但是就是这个很少部分不同,造成了notexists中的子查询返回不同的值,进而对外层查询产生不同的影响。
我们来用如下的代码模拟一下。
初始化数据:
--10g droptablet1; droptablet2; createtablet1(idnumber,namevarchar2(20),dep_idvarchar2(10)); createtablet2(idnumber,namevarchar2(20),dep_idvarchar2(10)); insertintot1selectrownum,'a','kk'fromdualconnectbylevel<=3000000; insertintot2selectrownum,'a','kk'fromdualconnectbylevel<=1000000; insertintot2selectrownum,'a','mm'fromdual; commit; --12c droptablet1; droptablet2; createtablet1(idnumber,namevarchar2(20),dep_idvarchar2(10)); createtablet2(idnumber,namevarchar2(20),dep_idvarchar2(10)); insertintot1selectrownum,'a','kk'fromdualconnectbylevel<=3000000; insertintot2selectrownum,'a','kk'fromdualconnectbylevel<=1000000; commit;
我们看到,12c的数据和10g只是有很少的差别,t1表12c和10g都一样,t2表在12c只是少了一行数据。
--10g SQL>selectdep_id,count(*)fromt1groupbydep_id; DEP_IDCOUNT(*) ------------------------------ kk3000000 SQL>selectdep_id,count(*)fromt2groupbydep_id; DEP_IDCOUNT(*) ------------------------------ mm1 kk1000000 SQL> --12c SQL>selectdep_id,count(*)fromt1groupbydep_id; DEP_IDCOUNT(*) ------------------------------ kk3000000 SQL>selectdep_id,count(*)fromt2groupbydep_id; DEP_IDCOUNT(*) ------------------------------ kk1000000 SQL>
我们将要执行的sql语句是:
selectcount(*) fromt1,t2 wheret1.id=t2.id andt1.dep_id='kk' andnotexists(select1 fromt1,t2 wheret1.id=t2.id andt2.dep_id='mm');
我们先来看执行情况的差距,10g的bufferget小,12c多:
--10g
SQL>select/*+gather_plan_statistics*/count(*)fromt1,t2wheret1.id=t2.idandt1.dep_id='kk'andnotexists(select1fromt1,t2wheret1.id=t2.idandt2.dep_id='mm');
COUNT(*)
----------
0
SQL>select*fromtable(dbms_xplan.display_cursor(null,null,'ALLSTATSLAST'));
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID22t5mb43w55pr,childnumber0
-------------------------------------
select/*+gather_plan_statistics*/count(*)fromt1,t2wheret1.id=t2.idandt1.dep_id='kk'andnot
exists(select1fromt1,t2wheret1.id=t2.idandt2.dep_id='mm')
Planhashvalue:3404612428
------------------------------------------------------------------------------------------------------------------
|Id|Operation|Name|Starts|E-Rows|A-Rows|A-Time|Buffers|OMem|1Mem|Used-Mem|
------------------------------------------------------------------------------------------------------------------
|0|SELECTSTATEMENT||1||1|00:00:00.02|2086||||
|1|SORTAGGREGATE||1|1|1|00:00:00.02|2086||||
|*2|FILTER||1||0|00:00:00.02|2086||||
|*3|HASHJOIN||0|901K|0|00:00:00.01|0|39M|5518K||
|4|TABLEACCESSFULL|T2|0|901K|0|00:00:00.01|0||||
|*5|TABLEACCESSFULL|T1|0|2555K|0|00:00:00.01|0||||
|*6|HASHJOIN||1|23|1|00:00:00.02|2086|1517K|1517K|612K(0)|
|*7|TABLEACCESSFULL|T2|1|23|1|00:00:00.02|2082||||
|8|TABLEACCESSFULL|T1|1|2555K|1|00:00:00.01|4||||
------------------------------------------------------------------------------------------------------------------
PredicateInformation(identifiedbyoperationid):
---------------------------------------------------
2-filter(ISNULL)
3-access("T1"."ID"="T2"."ID")
5-filter("T1"."DEP_ID"='kk')
6-access("T1"."ID"="T2"."ID")
7-filter("T2"."DEP_ID"='mm')
Note
-----
-dynamicsamplingusedforthisstatement
34rowsselected.
SQL>
--12c
SQL>select/*+gather_plan_statistics*/count(*)fromt1,t2wheret1.id=t2.idandt1.dep_id='kk'andnotexists(select1fromt1,t2wheret1.id=t2.idandt2.dep_id='mm');
COUNT(*)
----------
1000000
SQL>select*fromtable(dbms_xplan.display_cursor(null,null,'ALLSTATSLAST'));
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID22t5mb43w55pr,childnumber0
-------------------------------------
select/*+gather_plan_statistics*/count(*)fromt1,t2where
t1.id=t2.idandt1.dep_id='kk'andnotexists(select1fromt1,t2
wheret1.id=t2.idandt2.dep_id='mm')
Planhashvalue:1692274438
--------------------------------------------------------------------------------------------------------------------
|Id|Operation|Name|Starts|E-Rows|A-Rows|A-Time|Buffers|OMem|1Mem|Used-Mem|
--------------------------------------------------------------------------------------------------------------------
|0|SELECTSTATEMENT||1||1|00:00:00.79|10662||||
|1|SORTAGGREGATE||1|1|1|00:00:00.79|10662||||
|*2|FILTER||1||1000K|00:00:00.74|10662||||
|*3|HASHJOIN||1|1215K|1000K|00:00:00.52|8579|43M|6111K|42M(0)|
|4|TABLEACCESSFULL|T2|1|1215K|1000K|00:00:00.01|2083||||
|*5|TABLEACCESSFULL|T1|1|2738K|3000K|00:00:00.07|6496||||
|*6|HASHJOINRIGHTSEMI||1|35|0|00:00:00.02|2083|1245K|1245K|461K(0)|
|*7|TABLEACCESSFULL|T2|1|23|0|00:00:00.02|2083||||
|8|TABLEACCESSFULL|T1|0|2738K|0|00:00:00.01|0||||
--------------------------------------------------------------------------------------------------------------------
PredicateInformation(identifiedbyoperationid):
---------------------------------------------------
2-filter(ISNULL)
3-access("T1"."ID"="T2"."ID")
5-filter("T1"."DEP_ID"='kk')
6-access("T1"."ID"="T2"."ID")
7-filter("T2"."DEP_ID"='mm')
Note
-----
-dynamicstatisticsused:dynamicsampling(level=2)
35rowsselected.
SQL>
SQL>
可以看到第23,24行,在10g中运行时,buffers是0,而在12c中,即78,79行,buffer是2083+6496。
也就是说在10g中,外层查询不进行t1和t2的扫描,直接返回结果了,而在12c中,外层查询还要进行t1表和t2表层扫描才返回结果。
这其实不是10g和12c的差别,而是notexists的返回数据对外层的影响。子查询要返回0行记录,才满足notexist的条件,从而返回外层查询结果。
在10g中,子查询返回了一行记录
--10g SQL>select1fromt1,t2wheret1.id=t2.idandt2.dep_id='mm'; 1 ---------- 1 SQL>
不满足notexists(即0行才满足),所以,也就不用在外层继续查询了。直接返回记录0行。
在12c中,子查询返回0行记录,满足notexist的条件,所以还需要在外层查询中继续查询。
--12c
SQL>selectcount(*)fromt1,t2wheret1.id=t2.idandt2.dep_id='kk';
COUNT(*)
----------
1000000
SQL>setline1000
SQL>setpages1000
SQL>colPLAN_TABLE_OUTPUTfora250
SQL>
SQL>
SQL>select/*+gather_plan_statistics*/count(*)fromt1,t2wheret1.id=t2.idandt1.dep_id='kk'andnotexists(select1fromt1,t2wheret1.id=t2.idandt2.dep_id='kk');
COUNT(*)
----------
0
SQL>select*fromtable(dbms_xplan.display_cursor(null,null,'ALLSTATSLAST'));
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_IDc5hj2p2jt1fxf,childnumber0
-------------------------------------
select/*+gather_plan_statistics*/count(*)fromt1,t2where
t1.id=t2.idandt1.dep_id='kk'andnotexists(select1fromt1,t2
wheret1.id=t2.idandt2.dep_id='kk')
Planhashvalue:1692274438
--------------------------------------------------------------------------------------------------------------------
|Id|Operation|Name|Starts|E-Rows|A-Rows|A-Time|Buffers|OMem|1Mem|Used-Mem|
--------------------------------------------------------------------------------------------------------------------
|0|SELECTSTATEMENT||1||1|00:00:00.28|2087||||
|1|SORTAGGREGATE||1|1|1|00:00:00.28|2087||||
|*2|FILTER||1||0|00:00:00.28|2087||||
|*3|HASHJOIN||0|1215K|0|00:00:00.01|0|69M|7428K||
|4|TABLEACCESSFULL|T2|0|1215K|0|00:00:00.01|0||||
|*5|TABLEACCESSFULL|T1|0|2738K|0|00:00:00.01|0||||
|*6|HASHJOINRIGHTSEMI||1|2738K|1|00:00:00.28|2087|43M|6111K|42M(0)|
|*7|TABLEACCESSFULL|T2|1|1215K|1000K|00:00:00.12|2083||||
|8|TABLEACCESSFULL|T1|1|2738K|1|00:00:00.01|4||||
--------------------------------------------------------------------------------------------------------------------
PredicateInformation(identifiedbyoperationid):
---------------------------------------------------
2-filter(ISNULL)
3-access("T1"."ID"="T2"."ID")
5-filter("T1"."DEP_ID"='kk')
6-access("T1"."ID"="T2"."ID")
7-filter("T2"."DEP_ID"='kk')
Note
-----
-dynamicstatisticsused:dynamicsampling(level=2)
35rowsselected.
SQL>
可以看到第38,39行的buffer为0.
总结
以上就是这篇文章的全部内容了,希望本文的内容对大家的学习或者工作能带来一定的帮助,如果有疑问大家可以留言交流。