一、前言1、工作过程中,经常遇到需要造数的情况,比如为了测试接口性能,需要往数据库插入大量数据,这种情况下,如果仅靠普通的方式插入而不进行优化的话,太慢,太耗时,不能忍 2、网上给出的优化方式大多类似,但是并没有进行实测验证,为了验证这些方式是否有效,且给出真正实用的优化方案,本文将对千万级数据插入进行实测,为了提高测试准确性,会对不同的数据量进行多轮测试,每轮测3次取平均值,测试结果应该具有一定的参考性,希望对大家有帮助 二、文章概览1、测试环境说明,表结构 三、准备工作1、测试环境说明:
2、表结构:drop table if exists user; create table user( id int not null auto_increment comment '主键', username varchar(255) not null comment '用户名', password varchar(255) not null comment '密码', password_salt varchar(255) not null comment '密码随机盐值', nickname varchar(255) not null comment '昵称', user_no int default 0 not null comment '用户编码', ip varchar(255) comment 'IP地址', mobile varchar(11) comment '手机号', mail varchar(255) comment '邮箱', gender int default 0 not null comment '性别(0:男,1:女)', type int default 0 not null comment '类型(0:普通用户,1:超级管理员)', status int default 0 not null comment '状态(0:正常,1:黑名单,2:已注销)', is_deleted int default 0 not null comment '是否删除(0:有效,1:无效删除)', created_time datetime default now() not null comment '创建时间', updated_time datetime default now() not null comment '更新时间', primary key(id) ) comment = '用户表'; create unique index unq_idx_user_username on user(username); 四、测试(不修改mysql参数情况下)1、每次插入一条数据1.1 伪代码: insert into user(username, password) values('aaa', '123'); insert into user(username, password) values('bbb', '456'); insert into user(username, password) values('ccc', '789'); 1.2 存储过程: # 1、每次插入一条数据 drop procedure if exists insertIntoUser; delimiter $$ create procedure insertIntoUser(in num int) begin declare i int default 0; while i < num do set i = i + 1; set @username = concat('beigua', LPAD(i, 9, 0)); set @nickname = concat('北瓜', LPAD(i, 9, 0)); set @password = replace(uuid(), "-", ""); set @password_salt = replace(uuid(), "-", ""); set @user_no = i; INSERT INTO user(username, password, password_salt, nickname, user_no, ip, mobile, mail, gender, type, status, is_deleted, created_time, updated_time) VALUES (@username, @password, @password_salt, @nickname, @user_no, '192.168.1.1', '18888888888', '18888888888@163.com', '0', '0', '0', '0', now(), now()); end while; end $$ 1.3 调用存储过程,进行测试: truncate user; call insertIntoUser(100000); truncate user; call insertIntoUser(300000); 1.4 测试结果: image.png 可以看到,插入非常慢,10w数据188s左右,30w数据563s左右 2、每次插入多条数据2.1 伪代码: insert into user(username, password) values ('aaa', '123'), ('bbb', '456'), ('ccc', '789'); 2.2 存储过程: # 2、每次插入多条数据 drop procedure if exists insertIntoUserBatch; delimiter $$ # 创建存储过程,num表示插入的总行数,batch表示每次插入的行数 create procedure insertIntoUserBatch(in num int, in batchNum int) begin declare i int default 0; set @insert_value = ''; set @count = 0; set @batch_count = 0; while @count < num do # 内层while循环用于拼接insert into user(username, password) values('aaa', '123'), ('bbb', '456'), ('ccc', '789')...语句中values后面的部分 while (@batch_count < batchNum and @count < num) do set i = i + 1; set @username = concat('beigua', LPAD(i, 9, 0)); set @nickname = concat('北瓜', LPAD(i, 9, 0)); set @password = replace(uuid(), "-", ""); set @password_salt = replace(uuid(), "-", ""); set @user_no = i; if @batch_count > 0 then set @insert_value = concat(@insert_value, ','); end if; set @insert_value = concat(@insert_value, "(" , "'", @username, "'" , ",'", @password, "'" , ",'", @password_salt, "'" , ",'", @nickname, "'" , ",'", @user_no, "'" , ",'192.168.1.1'" , ",'18888888888'" , ",'18888888888@163.com'" , ",0" , ",0" , ",0" , ",0" , ",'", now(), "'" , ",'", now(), "'" , ")" ); set @batch_count = @batch_count + 1; end while; set @count = @count + @batch_count; # 拼接SQL语句并执行 set @exesql = concat("insert into user(username, password, password_salt, nickname, user_no, ip, mobile, mail, gender, type, status, is_deleted, created_time, updated_time) values ", @insert_value); prepare stmt from @exesql; execute stmt; deallocate prepare stmt; # 重置变量值 set @insert_value = ''; set @batch_count = 0; end while; # 数据插入完成后,查看表中总记录数 select count(id) from user; end $$ 2.3 测试结果: image.png 2.4 分析
3、批量提交事务3.1 伪代码: set autocommit = 0; insert into user(username, password) values('aaa', '123'); insert into user(username, password) values('bbb', '456'); insert into user(username, password) values('ccc', '789'); ... if 1000 commit; 3.2 存储过程: # 3、批量提交事务 drop procedure if exists insertIntoUser; delimiter $$ create procedure insertIntoUser(in num int, in batchNum int) begin declare i int default 0; while i < num do set i = i + 1; set @username = concat('beigua', LPAD(i, 9, 0)); set @nickname = concat('北瓜', LPAD(i, 9, 0)); set @password = replace(uuid(), "-", ""); set @password_salt = replace(uuid(), "-", ""); set @user_no = i; set autocommit = 0; INSERT INTO user(username, password, password_salt, nickname, user_no, ip, mobile, mail, gender, type, status, is_deleted, created_time, updated_time) VALUES (@username, @password, @password_salt, @nickname, @user_no, '192.168.1.1', '18888888888', '18888888888@163.com', '0', '0', '0', '0', now(), now()); if i mod batchNum = 0 then commit; end if; end while; end $$ 3.3 测试结果: image.png 3.4 分析
4、一次性提交所有事务4.1 伪代码: set autocommit = 0; insert into user(username, password) values('aaa', '123'); insert into user(username, password) values('bbb', '456'); insert into user(username, password) values('ccc', '789'); ... commit; 4.2 存储过程: # 4、一次性提交事务 drop procedure if exists insertIntoUser; delimiter $$ create procedure insertIntoUser(in num int) begin declare i int default 0; set autocommit = 0; while i < num do set i = i + 1; set @username = concat('beigua', LPAD(i, 9, 0)); set @nickname = concat('北瓜', LPAD(i, 9, 0)); set @password = replace(uuid(), "-", ""); set @password_salt = replace(uuid(), "-", ""); set @user_no = i; INSERT INTO user(username, password, password_salt, nickname, user_no, ip, mobile, mail, gender, type, status, is_deleted, created_time, updated_time) VALUES (@username, @password, @password_salt, @nickname, @user_no, '192.168.1.1', '18888888888', '18888888888@163.com', '0', '0', '0', '0', now(), now()); end while; commit; end $$ 4.3 测试结果: image.png 4.4 分析
5、数据插入前加索引与数据插入后加索引对比5.1 说明:索引可以提高查询效率,但同时也要承担维护索引(创建、更新)耗时的代价,通常情况下,建表时就创建好索引了,但是随着数据量的增大,如果每插入一条数据就更新一次索引,势必会影响插入效率,这里通过测试验证一下数据插入前加索引与数据插入后加索引对于数据插入的性能影响 5.2 为了节约时间,采用 create index idx_user_password on user(password); create index idx_user_password_salt on user(password_salt); create index idx_user_nickname on user(nickname); create index idx_user_user_no on user(user_no); 5.3 测试结果: image.png 5.4 分析
五、测试(修改mysql参数情况下)目前为止,都是在不修改mysql参数的情况下进行的性能优化,而如果条件允许,也可以通过修改mysql参数提升插入效率 这里主要是2组参数:
分别对以上2组参数进行测试,注意,这里都采用 1、第一组参数: |
|
来自: hongjing_z > 《Mysql》