Saturday, December 29, 2012

DB2 select count测试总结


DB2 select count测试总结
测试表:
测试环境:Aix 6.1 DB2 v9.7 FP6
Tb1:
db2 "create table tb1(id int not null primary key, name varchar(30) not null, value int)"
db2 "create index idx_tb1_name on tb1(name)"
db2 "
insert into tb1(id, name, value)
with temp (counter, name, value)
as (
values (1, 'abcdefghijklmnopqrstuvwxyz', mod(int(rand() * 100000), 25))
union all
select (counter + 1), 'abcdefghijklmnopqrstuvwxyz', mod(int(rand() * 100000), 25) from temp where (counter + 1) < 1 + 100000
)
select counter, name, value from temp
"
db2 "
insert into tb1(id, name, value)
with temp (counter, name, value)
as (
values (100001, 'abc', mod(int(rand() * 100000), 25))
union all
select (counter + 1), 'abc', mod(int(rand() * 100000), 25) from temp where (counter + 1) < 1 + 200000
)
select counter, name, value from temp
"
db2 "
insert into tb1(id, name, value)
with temp (counter, name, value)
as (
values (200001, 'hellwen', mod(int(rand() * 100000), 25))
union all
select (counter + 1), 'hellwen', mod(int(rand() * 100000), 25) from temp where (counter + 1) < 1 + 300000
)
select counter, name, value from temp
"

说明:ID为主键、Name中有三个键,分别:'abcdefghijklmnopqrstuvwxyz', 'abc''hellwen'。数据各位10万。Value为随机数。

测试语句:
分别测试count(*)count为主键、where中字段、非where中字段(有索引和没有索引两种情况)
db2 "explain all for select count(*) from db2iabc.tb1"
db2 "explain all for select count(name) from db2iabc.tb1"
db2 "explain all for select count(*) from db2iabc.tb1 where name = 'abc'"
db2 "explain all for select count(id) from db2iabc.tb1 where name = 'abc'"
db2 "explain all for select count(name) from db2iabc.tb1 where name = 'abc'"
db2 "explain all for select count(value) from db2iabc.tb1 where name = 'abc'"
# 查询执行计划
db2exfmt -d abc_db -1 –t

测试情况:

更新统计信息:
db2 runstats on table db2iabc.tb1 with distribution and sampled detailed indexes all

  1. count(*)where
因为没有where条件,所以DB2直接扫描主键,对rid进行groupby后得到结果。

db2 "explain all for select count(*) from db2iabc.tb1"

Original Statement:
------------------
select count(*)
from db2iabc.tb1


Optimized Statement:
-------------------
SELECT Q3.$C0
FROM
(SELECT COUNT(*)
FROM
(SELECT $RID$
FROM DB2IABC.TB1 AS Q1) AS Q2) AS Q3

Access Plan:
-----------
Total Cost: 2249.44
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
GRPBY
( 2)
2249.44
2013.45
|
299999
IXSCAN
( 3)
2228.19
2013.45
|
299999
INDEX: SYSIBM
SQL121226162132260
Q1


  1. count(name)where
执行计划同上,因为对全表进行count,所以使用哪个字段都一样。

db2 "explain all for select count(name) from db2iabc.tb1"

Original Statement:
------------------
select count(name)
from db2iabc.tb1


Optimized Statement:
-------------------
SELECT Q3.$C0
FROM
(SELECT COUNT(*)
FROM
(SELECT $RID$
FROM DB2IABC.TB1 AS Q1) AS Q2) AS Q3

Access Plan:
-----------
Total Cost: 2249.44
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
GRPBY
( 2)
2249.44
2013.45
|
299999
IXSCAN
( 3)
2228.19
2013.45
|
299999
INDEX: SYSIBM
SQL121226162132260
Q1


  1. count(*)where
从下面执行计划中可以看出db2sql语句进行了重写,因为name字段有索引,而且我们知道每个非主键索引中都包含rid,所以重写后db2通过name字段的索引查询rid,然后对rid进行groupby后就可以得到结果。是一个index only的操作。

db2 "explain all for select count(*) from db2iabc.tb1 where name = 'abc'"

Original Statement:
------------------
select count(*)
from db2iabc.tb1
where name = 'abc'


Optimized Statement:
-------------------
SELECT Q3.$C0
FROM
(SELECT COUNT(*)
FROM
(SELECT $RID$
FROM DB2IABC.TB1 AS Q1
WHERE (Q1."NAME" = 'abc')) AS Q2) AS Q3

Access Plan:
-----------
Total Cost: 1815.68
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
GRPBY
( 2)
1815.68
232.665
|
99999
IXSCAN
( 3)
1808.59
232.665
|
299999
INDEX: DB2IABC
IDX_TB1_NAME
Q1

  1. count(主键)where
采用count主键的方式情况同上,因为主键包含在name索引中,所以可以直接查询索引得到count结果。

db2 "explain all for select count(id) from db2iabc.tb1 where name = 'abc'"

Original Statement:
------------------
select count(id)
from db2iabc.tb1
where name = 'abc'


Optimized Statement:
-------------------
SELECT Q3.$C0
FROM
(SELECT COUNT(*)
FROM
(SELECT $RID$
FROM DB2IABC.TB1 AS Q1
WHERE (Q1."NAME" = 'abc')) AS Q2) AS Q3

Access Plan:
-----------
Total Cost: 1815.7
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
GRPBY
( 2)
1815.7
232.667
|
100000
IXSCAN
( 3)
1808.61
232.667
|
299999
INDEX: DB2IABC
IDX_TB1_NAME
Q1

  1. countwhere字段)
大家可以看Optimized Statement:DB2优化器重写)后的语句情况同上,执行计划也完全一样。DB2直接判断出count(name)count(*)都是可以采用name字段的索引进行查询,结果都一样。所以将语句重写成count(*)

db2 "explain all for select count(name) from db2iabc.tb1 where name = 'abc'"

Original Statement:
------------------
select count(name)
from db2iabc.tb1
where name = 'abc'


Optimized Statement:
-------------------
SELECT Q3.$C0
FROM
(SELECT COUNT(*)
FROM
(SELECT $RID$
FROM DB2IABC.TB1 AS Q1
WHERE (Q1."NAME" = 'abc')) AS Q2) AS Q3

Access Plan:
-----------
Total Cost: 1815.7
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
GRPBY
( 2)
1815.7
232.667
|
100000
IXSCAN
( 3)
1808.61
232.667
|
299999
INDEX: DB2IABC
IDX_TB1_NAME
Q1

  1. count(1)
情况同上,DB2直接查询索引对RID进行groupby操作。返回结果时替换成1

db2 "explain all for select count(1) from db2iabc.tb1 where name = 'abc'"

Original Statement:
------------------
select count(1)
from db2iabc.tb1
where name = 'abc'


Optimized Statement:
-------------------
SELECT Q3.$C0
FROM
(SELECT COUNT(1)
FROM
(SELECT $RID$
FROM DB2IABC.TB1 AS Q1
WHERE (Q1."NAME" = 'abc')) AS Q2) AS Q3

Access Plan:
-----------
Total Cost: 1815.68
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
GRPBY
( 2)
1815.68
232.665
|
99999
IXSCAN
( 3)
1808.59
232.665
|
299999
INDEX: DB2IABC
IDX_TB1_NAME
Q1


  1. count(非where字段)
大家先看优化器重写后的SQL,这里优化器已经没办法将count(value)转换成count(*)。因为where中使用的索引无法获取value字段的值,db2必须进行表的查询才能取到value,所以执行计划也发生了很大的变化,Total Cost变得很高。

db2 "explain all for select count(value) from db2iabc.tb1 where name = 'abc'"

Original Statement:
------------------
select count(value)
from db2iabc.tb1
where name = 'abc'


Optimized Statement:
-------------------
SELECT Q3.$C0
FROM
(SELECT COUNT(Q2."VALUE")
FROM
(SELECT Q1."VALUE"
FROM DB2IABC.TB1 AS Q1
WHERE (Q1."NAME" = 'abc')) AS Q2) AS Q3

Access Plan:
-----------
Total Cost: 2765.49
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
GRPBY
( 2)
2765.49
1107
|
100000
FETCH
( 3)
2758.41
1107
/---+----\
100000 299999
IXSCAN TABLE: DB2IABC
( 4) TB1
1808.61 Q1
232.667
|
299999
INDEX: DB2IABC
IDX_TB1_NAME
Q1

  1. count(where字段),添加count字段的索引
count的字段中增加了索引,查看执行计划可以知道该索引并没有效果。DB2无法通过两个索引关联进行查询。DB2中的索引只能进行IXANDIXOR

db2 "create index idx_tb1_value on tb1(value)"
db2 "explain all for select count(value) from db2iabc.tb1 where name = 'abc'"

Original Statement:
------------------
select count(value)
from db2iabc.tb1
where name = 'abc'


Optimized Statement:
-------------------
SELECT Q3.$C0
FROM
(SELECT COUNT(Q2."VALUE")
FROM
(SELECT Q1."VALUE"
FROM DB2IABC.TB1 AS Q1
WHERE (Q1."NAME" = 'abc')) AS Q2) AS Q3

Access Plan:
-----------
Total Cost: 2765.49
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1
GRPBY
( 2)
2765.49
1107
|
100000
FETCH
( 3)
2758.41
1107
/---+----\
100000 299999
IXSCAN TABLE: DB2IABC
( 4) TB1
1808.61 Q1
232.667
|
299999
INDEX: DB2IABC
IDX_TB1_NAME
Q1


总结:
  1. 对于目前的优化器已经能很好的处理count(*)
  2. count(1)count(*)count(主键) 效率都一样,所以count时建议使用这几种格式

No comments:

Post a Comment