背景: 目的1: View Code
mylib.py View Code
include.py View Code
测试1: root@localhost : test 02:26:13>create table tt(id int auto_increment,name varchar(10),age int,address varchar(20),primary key (id))engine=innodb; Query OK, 0 rows affected (0.17 sec) root@zhoujy:/var/lib/mysql/test# ls -lh tt.ibd -rw-rw---- 1 mysql mysql 96K 2012-10-17 14:26 tt.ibd 查看ibd: root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/tt.ibd -v page offset 00000000, page type <File Space Header> page offset 00000001, page type <Insert Buffer Bitmap> page offset 00000002, page type <File Segment inode> page offset 00000003, page type <B-tree Node>, page level <0000> ---叶子节点 page offset 00000000, page type <Freshly Allocated Page> page offset 00000000, page type <Freshly Allocated Page> Total number of page: 6: Freshly Allocated Page: 2 Insert Buffer Bitmap: 1 File Space Header: 1 B-tree Node: 1 File Segment inode: 1 解释: root@localhost : test 02:42:58>insert into tt values(name,age,address) values('aaa',23,'HZZZ'); 疑惑:为什么没有申请区?区是64个连续的页,大小1M。那么表大小也应该是至少1M。但是现在只有96K(默认)。原因是因为每个段开始的时候,先有32个页大小的碎片页存放数据,使用 View Code
"额外"页:4个 root@zhoujy:/home/zhoujy/jiaoben/read_ibd# ls -lh /var/lib/mysql/test/tt.ibd -rw-rw---- 1 mysql mysql 2.0M 2012-10-17 16:10 /var/lib/mysql/test/tt.ibd root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/tt.ibd Total number of page: 128: Freshly Allocated Page: 91 Insert Buffer Bitmap: 1 File Space Header: 1 B-tree Node: 34 File Segment inode: 1 页从36跳到了128,因为已经用完了32个碎片页,新的页会采用区的方式进行空间申请。信息中看到有很多可用页,正好说明这点。 ▲溢出行数据存放:INNODB存储引擎是索引组织的,即每页中至少有两行记录,因此如果页中只能存放一行记录,INNODB会自动将行数据放到溢出页中。当发生溢出行的时候,实际数据保存在BLOB页中,数据页只保存数据的前768字节(老的文件格式),新的文件格式(Barracuda)采用完全行溢出的方式,数据页只保存20个字节的指针,BLOB也保存所有数据。如何查看表中有溢出行数据呢? root@localhost : test 04:52:34>create table t1 (id int,name varchar(10),memo varchar(8000))engine =innodb default charset utf8; Query OK, 0 rows affected (0.16 sec) root@localhost : test 04:53:10>insert into t1 values(1,'zjy',repeat('我',8000)); Query OK, 1 row affected (0.00 sec) 查看ibd: root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/t1.ibd -v page offset 00000000, page type <File Space Header> page offset 00000001, page type <Insert Buffer Bitmap> page offset 00000002, page type <File Segment inode> page offset 00000003, page type <B-tree Node>, page level <0000> page offset 00000004, page type <Uncompressed BLOB Page> page offset 00000005, page type <Uncompressed BLOB Page> Total number of page: 6: Insert Buffer Bitmap: 1 Uncompressed BLOB Page: 2 File Space Header: 1 B-tree Node: 1 File Segment inode: 1 从信息中看到,刚才插入的一行记录,已经溢出了,保存到了2个BLOB页中(<Uncompressed BLOB Page>)。因为1页只有16K,又要存2行数据,所以每行记录最好小于8K,而上面的远远大于8K,所以被溢出了。当然这个也不是包括特大字段,要是一张表里面有5个字段都是varchar(512)【多个varchar的总和大于8K就可以】,也会溢出: root@localhost : test 05:08:39>create table t2 (id int,name varchar(1000),address varchar(512),company varchar(200),xx varchar(512),memo varchar(512),dem varchar(1000))engine =innodb default charset utf8; Query OK, 0 rows affected (0.17 sec) root@localhost : test 05:08:43>insert into t2 values(1,repeat('周',1000),repeat('我',500),repeat('丁',500),repeat('啊',500),repeat('噢',500),repeat('阿a',500)); 1000+500+500+500+500+500=3500*3>8000字节;行会被溢出: root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/t2.ibd -v page offset 00000000, page type <File Space Header> page offset 00000001, page type <Insert Buffer Bitmap> page offset 00000002, page type <File Segment inode> page offset 00000003, page type <B-tree Node>, page level <0000> page offset 00000004, page type <Uncompressed BLOB Page> page offset 00000000, page type <Freshly Allocated Page> Total number of page: 6: Insert Buffer Bitmap: 1 Freshly Allocated Page: 1 File Segment inode: 1 B-tree Node: 1 File Space Header: 1 Uncompressed BLOB Page: 1 <Uncompressed BLOB Page> 页存放真正的数据,那数据页到底存放什么?用hexdump查看: root@zhoujy:/home/zhoujy/jiaoben/read_ibd# hexdump -C -v /var/lib/mysql/test/t1.ibd > t1.txt 查看ibd: View Code
文本中刚好是48行,每行16字节。48*16=768字节,刚好验证了之前说的:数据页只保存数据的前768字节(老的文件格式)。 总结1: |变长字段长度列表(1~2字节)|NULL标志位(1字节)|记录头信息(5字节)|RowID(6字节)|事务ID(6字节)|回滚指针(7字节)|
上面信息除了 "NULL标志位"[表中所有字段都定义为NOT NULL],"RowID"[表中有主键] ,"变长字段长度列表" [没有变长字段] 可能不存在外,其他信息都会出现。所以一行数据除了列数据所占用的字段外,还需要额外18字节。 mysql> create table mytest(t1 varchar(10),t2 varchar(10),t3 varchar(10) ,t4 varchar(10))engine=innodb charset = latin1 row_format=compact; Query OK, 0 rows affected (0.08 sec) mysql> insert into mytest values('a','bb','bb','ccc'); Query OK, 1 row affected (0.02 sec) mysql> insert into mytest values('a','ee','ee','fff'); Query OK, 1 row affected (0.01 sec) mysql> insert into mytest values('a',NULL,NULL,'fff'); Query OK, 1 row affected (0.00 sec) 测试数据准备完之后,执行shell命令: root@zhoujy:/usr/local/mysql/test# hexdump -C -v mytest.ibd > /home/zhoujy/mytest.txt 打开mytest.txt文件找到supremum这一行: 0000c070 73 75 70 72 65 6d 75 6d 03 02 02 01 00 00 00 10 |supremum........| ----------->一行,16字节 0000c080 00 25 00 00 00 03 b9 00 00 00 00 02 49 01 82 00 |.%..........I...| 0000c090 00 01 4a 01 10 61 62 62 62 62 63 63 63 03 02 02 |..J..abbbbccc...| 0000c0a0 01 00 00 00 18 00 23 00 00 00 03 b9 01 00 00 00 |......#.........| 0000c0b0 02 49 02 83 00 00 01 4b 01 10 61 65 65 65 65 66 |.I.....K..aeeeef| 0000c0c0 66 66 03 01 06 00 00 20 ff a6 00 00 00 03 b9 02 |ff..... ........| 0000c0d0 00 00 00 02 49 03 84 00 00 01 4c 01 10 61 66 66 |....I.....L..aff| 0000c0e0 66 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |f...............| 解释: 03 02 02 01 VS 03 01 ----------当值为NULL时,变长字段列表不会占用存储空间。 61 62 62 62 62 63 63 63 VS 61 66 66 66 --------- NULL值没有存储,不占空间 结论:当值为NULL时,变长字段列表不会占用存储空间。NULL值没有存储,不占空间,但是需要一个标志位(一行一个)。 mysql> create table mytest(t1 varchar(10) NOT NULL,t2 varchar(10) NOT NULL,t3 varchar(10) NOT NULL,t4 varchar(10) NOT NULL)engine=innodb charset = latin1 row_format=compact; Query OK, 0 rows affected (0.03 sec) mysql> insert into mytest values('a','bb','bb','ccc'); Query OK, 1 row affected (0.01 sec) mysql> insert into mytest values('a','ee','ee','fff'); Query OK, 1 row affected (0.01 sec) mysql> insert into mytest values('a',NULL,NULL,'fff'); ERROR 1048 (23000): Column 't2' cannot be null 步骤和上面一样,得到的ibd的结果是: 0000c070 73 75 70 72 65 6d 75 6d 03 02 02 01 00 00 10 00 |supremum........| 0000c080 24 00 00 00 03 b9 03 00 00 00 02 49 07 87 00 00 |$..........I....| 0000c090 01 4f 01 10 61 62 62 62 62 63 63 63 03 02 02 01 |.O..abbbbccc....| 0000c0a0 00 00 18 ff cb 00 00 00 03 b9 04 00 00 00 02 49 |...............I| 0000c0b0 08 88 00 00 01 50 01 10 61 65 65 65 65 66 66 66 |.....P..aeeeefff| 和上面比较,发现少了NULL的标志位信息。 mysql> create table mytest(t1 varchar(10) NOT NULL,t2 varchar(10) NOT NULL DEFAULT '',t3 varchar(10) NOT NULL ,t4 varchar(10))engine=innodb charset = latin1 row_format=compact; Query OK, 0 rows affected (0.02 sec) mysql> insert into mytest(t1,t2) values('A','BB'); Query OK, 1 row affected, 1 warning (0.01 sec) 步骤和上面一样,得到的ibd的结果是: 0000c070 73 75 70 72 65 6d 75 6d 00 02 01 01 00 00 10 ff |supremum........| 0000c080 ef 00 00 00 43 b9 03 00 00 00 02 4a 15 90 00 00 |....C......J....| 0000c090 01 c2 01 10 41 42 42 00 00 00 00 00 00 00 00 00 |....ABB.........| 和上面2个区别主要在于变长列表和列数据这里。 结论:列数据信息里表明了 NULL数据和''数据都不占用任何空间,对于变长字段列表的信息,和一对比得出:‘’数据虽然不需要占用任何存储空间,但是在变长字段列表里面还是需要占用一个字节<毕竟还是一个‘’值>,NULL值不需要占用”,只是NULL会有额外的一个标志位,所以能有个优化的说法:“数据库表中能设置NOT NULL的就尽量设置为NOT NULL,除非确实需要NULL值得。” 在此得到了证明。 上面的测试都是针对VARCHAR的变长类型,那对于CHAR呢? root@localhost : test 10:33:35>create table mytest(t1 char(10),t2 char(10),t3 char(10) ,t4 char(10))engine=innodb charset = latin1 row_format=compact;Query OK, 0 rows affected (0.16 sec) root@localhost : test 10:33:59>insert into mytest values('a','bb','bb','ccc'); Query OK, 1 row affected (0.00 sec) root@localhost : test 10:34:09>insert into mytest values('a','ee','ee','fff'); Query OK, 1 row affected (0.00 sec) root@localhost : test 10:34:19>insert into mytest values('a',NULL,NULL,'fff'); Query OK, 1 row affected (0.00 sec) 打开ibd生成的文件: 0000c060 02 00 1b 69 6e 66 69 6d 75 6d 00 04 00 0b 00 00 |...infimum......| 0000c070 73 75 70 72 65 6d 75 6d 00 00 00 10 00 41 00 00 |supremum.....A..| 0000c080 00 0a f5 00 00 00 00 81 2d 07 80 00 00 00 32 01 |........-.....2.| 0000c090 10 61 20 20 20 20 20 20 20 20 20 62 62 20 20 20 |.a bb | 0000c0a0 20 20 20 20 20 62 62 20 20 20 20 20 20 20 20 63 | bb c| 0000c0b0 63 63 20 20 20 20 20 20 20 00 00 00 18 00 41 00 |cc .....A.| 0000c0c0 00 00 0a f5 01 00 00 00 81 2d 08 80 00 00 00 32 |.........-.....2| 0000c0d0 01 10 61 20 20 20 20 20 20 20 20 20 65 65 20 20 |..a ee | 0000c0e0 20 20 20 20 20 20 65 65 20 20 20 20 20 20 20 20 | ee | 0000c0f0 66 66 66 20 20 20 20 20 20 20 06 00 00 20 ff 70 |fff ... .p| 0000c100 00 00 00 0a f5 02 00 00 00 81 2d 09 80 00 00 00 |..........-.....| 0000c110 32 01 10 61 20 20 20 20 20 20 20 20 20 66 66 66 |2..a fff| 0000c120 20 20 20 20 20 20 20 00 00 00 00 00 00 00 00 00 | .........| 和一的varchar比较发现:少了变长字段列表,但是对于char来讲,需要固定长度来存储的,存不到固定长度,也会被填充满。如:20;并且NULL值也不需要占用存储空间。 root@localhost : test 11:21:48>create table mytest(t1 int,t2 char(10),t3 varchar(10) ,t4 char(10))engine=innodb charset = latin1 row_format=compact; Query OK, 0 rows affected (0.17 sec) root@localhost : test 11:21:50>insert into mytest values(1,'a','b','c'); Query OK, 1 row affected (0.00 sec) root@localhost : test 11:22:06>insert into mytest values(11,'aa','bb','cc'); Query OK, 1 row affected (0.00 sec) 从上面的表结构中看出: 0000c070 73 75 70 72 65 6d 75 6d 01 00 00 00 10 00 33 00 |supremum......3.| 0000c080 00 00 0a f5 07 00 00 00 81 2d 1a 80 00 00 00 32 |.........-.....2| 0000c090 01 10 80 00 00 01 61 20 20 20 20 20 20 20 20 20 |......a | 0000c0a0 62 63 20 20 20 20 20 20 20 20 20 02 00 00 00 18 |bc .....| 0000c0b0 ff be 00 00 00 0a f5 08 00 00 00 81 2d 1b 80 00 |............-...| 0000c0c0 00 00 32 01 10 80 00 00 0b 61 61 20 20 20 20 20 |..2......aa | 0000c0d0 20 20 20 62 62 63 63 20 20 20 20 20 20 20 20 00 | bbcc .| 从上信息得出和之前预料的一样:因为表中只有一个varchar字段,所以,变长列表长度就只有:01 上面都是latin1单字节字符集的说明,那对于多字节字符集的情况怎么样? root@localhost : test 11:52:10>create table mytest(id int auto_increment,t2 varchar(10),t3 varchar(10) ,t4 char(10),primary key(id))engine=innodb charset = utf8 row_format=compact; Query OK, 0 rows affected (0.17 sec) root@localhost : test 11:52:11>insert into mytest(t2,t3,t4) values('bb','bb','ccc'); Query OK, 1 row affected (0.00 sec) root@localhost : test 11:55:34>insert into mytest(t2,t3,t4) values('我们','他们','我们的'); Query OK, 1 row affected (0.00 sec) ibd信息如下: 0000c070 73 75 70 72 65 6d 75 6d 0a 02 02 00 00 00 10 00 |supremum........| 0000c080 28 80 00 00 01 00 00 00 81 2d 27 80 00 00 00 32 |(........-'....2| 0000c090 01 10 62 62 62 62 63 63 63 20 20 20 20 20 20 20 |..bbbbccc | 0000c0a0 0a 06 06 00 00 00 18 ff c7 80 00 00 02 00 00 00 |................| 0000c0b0 81 2d 28 80 00 00 00 32 01 10 e6 88 91 e4 bb ac |.-(....2........| 0000c0c0 e4 bb 96 e4 bb ac e6 88 91 e4 bb ac e7 9a 84 20 |............... | 因为表有了主键,所以ROWID(6字节)不见了。 总结2: |
|