配色: 字号:
mysql中innodb表中count
2016-09-05 | 阅:  转:  |  分享 
  
mysql中innodb表中count()优化

count()是用来统计数据表中所有记录的一个函数了,但在此函数在innodb中性能不怎么样了,下面我们来看看mysql中innodb表中count()优化,希望例子对各位有帮助.



起因:在innodb表上做count()统计实在是太慢了,因此想办法看能不能再快点.



现象:先来看几个测试案例,如下.



一、sbtest表上的测试.



showcreatetablesbtest\G

1.row

Table:sbtest

CreateTable:CREATETABLE`sbtest`(

`aid`bigint(20)unsignedNOTNULLauto_increment,

`id`int(10)unsignedNOTNULLdefault''0'',

`k`int(10)unsignedNOTNULLdefault''0'',

`c`char(120)NOTNULLdefault'''',

`pad`char(60)NOTNULLdefault'''',

PRIMARYKEY(`aid`),

KEY`k`(`k`),

KEY`id`(`id`)

)ENGINE=InnoDBAUTO_INCREMENT=1000001DEFAULTCHARSET=latin1

showindexfromsbtest;

+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

|Table|Non_unique|Key_name|Seq_in_index|Column_name|Collation|Cardinality|Sub_part|Packed|Null|Index_type|Comment|--phpfensi.com

+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

|sbtest|0|PRIMARY|1|aid|A|1000099|NULL|NULL||BTREE||

|sbtest|1|k|1|k|A|18|NULL|NULL||BTREE||

|sbtest|1|id|1|id|A|1000099|NULL|NULL||BTREE||

+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

填充了100万条记录.



1、直接count()



explainSELECTCOUNT()FROMsbtest;

+----+-------------+--------+-------+---------------+---------+---------+------+---------+-------------+

|id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|

+----+-------------+--------+-------+---------------+---------+---------+------+---------+-------------+

|1|SIMPLE|sbtest|index|NULL|PRIMARY|8|NULL|1000099|Usingindex|

+----+-------------+--------+-------+---------------+---------+---------+------+---------+-------------+

SELECTCOUNT()FROMsbtest;

+----------+

|COUNT()|

+----------+

|1000000|

+----------+

1rowinset(1.42sec)

可以看到,如果不加任何条件,那么优化器优先采用primarykey来进行扫描.



2、count()使用primarykey字段做条件.



explainSELECTCOUNT()FROMsbtestWHEREaid>=0;

+----+-------------+--------+-------+---------------+---------+---------+------+--------+--------------------------+

|id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|

+----+-------------+--------+-------+---------------+---------+---------+------+--------+--------------------------+

|1|SIMPLE|sbtest|range|PRIMARY|PRIMARY|8|NULL|485600|Usingwhere;Usingindex|

+----+-------------+--------+-------+---------------+---------+---------+------+--------+--------------------------+

SELECTCOUNT()FROMsbtestWHEREaid>=0;

+----------+

|COUNT()|

+----------+

|1000000|

+----------+

1rowinset(1.39sec)

可以看到,尽管优化器认为只需要扫描485600条记录(其实是索引),比刚才少多了,但其实仍然要做全表(索引)扫描,因此耗时和第一种相当.



3、count()使用secondaryindex字段做条件



explainSELECTCOUNT()FROMsbtestWHEREid>=0;

+----+-------------+--------+-------+---------------+------+---------+------+--------+--------------------------+

|id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|

+----+-------------+--------+-------+---------------+------+---------+------+--------+--------------------------+

|1|SIMPLE|sbtest|range|id|id|4|NULL|500049|Usingwhere;Usingindex|

+----+-------------+--------+-------+---------------+------+---------+------+--------+--------------------------+

SELECTCOUNT()FROMsbtestWHEREid>=0;

+----------+

|COUNT()|

+----------+

|1000000|

+----------+

1rowinset(0.43sec)

可以看到,采用这种方式查询会非常快,有人也许会问了,会不会是因为id字段的长度比aid字段的长度来的小,导致它扫描起来比较快呢?先不着急下结论,咱们来看看下面的测试例子.



二、sbtest1表上的测试



showcreatetablesbtest1\G

1.row

Table:sbtest1

CreateTable:CREATETABLE`sbtest1`(

`aid`int(10)unsignedNOTNULLAUTO_INCREMENT,

`id`bigint(20)unsignedNOTNULLDEFAULT''0'',

`k`int(10)unsignedNOTNULLDEFAULT''0'',

`c`char(120)NOTNULLDEFAULT'''',

`pad`char(60)NOTNULLDEFAULT'''',

PRIMARYKEY(`aid`),

KEY`k`(`k`),

KEY`id`(`id`)

)ENGINE=InnoDBAUTO_INCREMENT=1000001DEFAULTCHARSET=latin1

showindexfrwww.shanxiwang.netomsbtest1;

+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

|Table|Non_unique|Key_name|Seq_in_index|Column_name|Collation|Cardinality|Sub_part|Packed|Null|Index_type|Comment|

+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

|sbtest1|0|PRIMARY|1|aid|A|1000099|NULL|NULL||BTREE||

|sbtest1|1|k|1|k|A|18|NULL|NULL||BTREE||

|sbtest1|1|id|1|id|A|1000099|NULL|NULL||BTREE||

+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

这个表里,把aid和id的字段长度调换了一下,也填充了1000万条记录.



1、直接count().



explainSELECTCOUNT()FROMsbtest1;

+----+-------------+---------+-------+---------------+---------+---------+------+---------+-------------+

|id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|

+----+-------------+---------+-------+---------------+---------+---------+------+---------+-------------+

|1|SIMPLE|sbtest1|index|NULL|PRIMARY|4|NULL|1000099|Usingindex|

+----+-------------+---------+-------+---------------+---------+---------+------+---------+-------------+

SELECTCOUNT()FROMsbtest1;

+----------+

|COUNT()|

+----------+

|1000000|

+----------+

1rowinset(1.42sec)

可以看到,如果不加任何条件,那么优化器优先采用primarykey来进行扫描.



2、count()使用primarykey字段做条件.



explainSELECTCOUNT()FROMsbtest1WHEREaid>=0;

+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+

|id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|

+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+

|1|SIMPLE|sbtest1|range|PRIMARY|PRIMARY|4|NULL|316200|Usingwhere;Usingindex|

+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+

1rowinset(0.00sec)

SELECTCOUNT()FROMsbtest1WHEREaid>=0;

+----------+

|COUNT()|

+----------+

|1000000|

+----------+

1rowinset(1.42sec)

可以看到,尽管优化器认为只需要扫描485600条记录(其实是索引),比刚才少多了,但其实仍然要做全表(索引)扫描,因此耗时和第一种相当.



3、count()使用secondaryindex字段做条件.



explainSELECTCOUNT()FROMsbtest1WHEREid>=0;

+----+-------------+---------+-------+---------------+------+---------+------+--------+--------------------------+

|id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|

+----+-------------+---------+-------+---------------+------+---------+------+--------+--------------------------+

|1|SIMPLE|sbtest1|range|id|id|8|NULL|500049|Usingwhere;Usingindex|

+----+-------------+---------+-------+---------------+------+---------+------+--------+--------------------------+

1rowinset(0.00sec)

SELECTCOUNT()FROMsbtest1WHEREid>=0;

+----------+

|COUNT()|

+----------+

|1000000|

+----------+

1rowinset(0.45sec)

可以看到,采用这种方式查询会非常快,上面的所有测试,均在mysql5.1.24环境下通过,并且每次查询前都重启了mysqld.



可以看到,把aid和id的长度调换之后,采用secondaryindex查询仍然是要比用primarykey查询来的快很多。看来主要不是字段长度引起的索引扫描快慢,而是采用primarykey以及secondaryindex引起的区别,那么,为什么用secondaryindex扫描反而比primarykey扫描来的要快呢?我们就需要了解innodb的clusteredindex和secondaryindex之间的区别了.



innodb的clusteredindex是把primarykey以及rowdata保存在一起的,而secondaryindex则是单独存放,然后有个指针指向primarykey,因此,需要进行count()统计表记录总数时,利用secondaryindex扫描起来,显然更快,而primarykey则主要在扫描索引,同时要返回结果记录时的作用较大,例如:



SELECTFROMsbtestWHEREaid=xxx;



那既然是使用secondaryindex会比primarykey更快,为何优化器却优先选择primarykey来扫描呢,HeikkiTuuri的回答是:



intheexampletable,thesecondaryindexisinsertedintoinaperfectorder!Thatis

veryunusual.Normallythesecondaryindexwouldbefragmented,causingrandomdiskI/O,

andthescanwouldbeslowerthanintheprimaryindex.

Iamchangingthistoafeaturerequest:keep''clusteringratio''statisticsonasecondary

indexanddothescanthereiftheorderisalmostthesameasintheprimaryindex.I

doubtthisfeaturewilleverbeimplemented,though.

献花(0)
+1
(本文系网络学习天...首藏)