MS SQL 索引(三)

greenyun588 2013-09-01

展开全文

碎片：当索引所在页面的基于主关键字的逻辑顺序，和数据文件中的物理顺序不匹配时，碎片就产生了。所有的叶级页包含了指向前一个和后一个页的指针。这样就形成一个双链表。理想情况下，数据文件中页的物理顺序会和逻辑顺序匹配。整个磁盘的工作性能在物理顺序匹配逻辑顺序时将显著提升。对某些特定的查询而言，这将带来极佳的性能。当物理排序和逻辑排序不匹配时，磁盘的工作性能会变得低效，这是因为磁头必须向前和向后移动来查找索引，而不是只象某个单一方向来搜索。碎片会影响I/O性能，不过对于位于SQL Server数据缓冲内的数据页而言，碎片并不会带来任何影响。当索引第一次创建时，没有或者只有极少碎片。随着时间推移，插入，更新和删除数据，和这些数据相关的索引上的碎片就增加了

在索引碎片整理前，请确保系统资源的一些问题，比如物理磁盘碎片，不合理的基础结构等因素会给性能带来负面影响：

系统资源问题：在索引碎片整理之前，要确认系统任何性能问题和系统资源限制无关。关于这方面的详细讨论已经超出了本文的范围，不过有些更常见的资源问题和I/O子系统性能，内存使用以及CPU使用率相关。关于分析这些类型资源问题的更深入讨论，请见本文最后的“更多的信息”章节。

物理磁盘碎片：在某些系统上，磁盘碎片会带来很糟的性能。要确定是否存在磁盘碎片，可以使用Microsoft Windows自带的系统工具，或者第三方提供的工具来分析SQL Server所在的分区。对于常规的I/O子系统上的规模较小的数据库，建议在运行索引碎片整理工具前，先进行磁盘碎片整理。而对于更智能的磁盘子系统上的规模较大的数据库，例如SAN(存储区域网络 storage area networks)环境，磁盘碎片整理就不是必要的。

索引在数据库占据了很重要的位置，索引需要牵涉到索引创建、优化和维护多方面的工作，本文以实例结合相关原理来介绍索引维护相关的知识。实现步骤：

1. 以什么标准判断索引是否需要维护?

2. 索引维护的方法有哪些?

3. 能否方便地整理出比较通用的维护过程，实现自动化维护?

（一）、以什么标准判断索引是否需要维护?

从索引维护的角度来讲，最主要的参考标准就是索引碎片的大小。通常情况下，索引碎片在10%以内，是可以接受的。下面介绍获取索引碎片的方法：

SQL Server 2000: DBCC SHOWCONTIG （显示指定的表的数据和索引的碎片信息）

执行 DBCC SHOWCONTIG 的结果如下：

DBCC SHOWCONTIG 正在扫描 'StoreContact' 表...
表: 'StoreContact' (30623152)；索引 ID: 1，数据库 ID: 6
已执行 TABLE 级别的扫描。
- 扫描页数................................: 5
- 扫描区数..............................: 2
- 区切换次数..............................: 1
- 每个区的平均页数........................: 2.5
- 扫描密度 [最佳计数:实际计数].......: 50.00% [1:2]
- 逻辑扫描碎片 ..................: 20.00%
- 区扫描碎片 ..................: 50.00%
- 每页的平均可用字节数........................: 1319.0
- 平均页密度(满).....................: 83.70%

DBCC SHOWCONTIG 正在扫描 'Address' 表...
表: 'Address' (53575229)；索引 ID: 1，数据库 ID: 6
已执行 TABLE 级别的扫描。
- 扫描页数................................: 278
- 扫描区数..............................: 35
- 区切换次数..............................: 34
- 每个区的平均页数........................: 7.9
- 扫描密度 [最佳计数:实际计数].......: 100.00% [35:35]
- 逻辑扫描碎片 ..................: 0.00%
- 区扫描碎片 ..................: 14.29%
- 每页的平均可用字节数........................: 79.1
- 平均页密度(满).....................: 99.02%

扫描页数（Page Scanned）：如果你知道行的近似尺寸和表或索引里的行数，那么你可以估计出索引里的页数。看看扫描页数，如果明显比你估计的页数要高，说明存在内部碎片。
扫描区数（Extents Scanned）：用扫描页数除以8,四舍五入到下一个最高值。该值应该和DBCC SHOWCONTIG返回的扫描扩展盘区数一致。如果返回的数高，说明存在外部碎片。碎片的严重程度依赖于刚才显示的值比估计值高多少。
扩展盘区切换次数（Extent Switches）：该数应该等于扫描区数减1。高了则说明有外部碎片。
每个扩展盘区的平均页数（Avg. Pages per Extent）：扫描页数除以扫描扩展盘区数，一般是8。小于8说明有外部碎片。
扫描密度［最佳值:实际值］（Scan Density [Best Count:Actual Count]）：DBCC SHOWCONTIG返回最有用的一个百分比。这是扩展盘区的最佳值和实际值的比率。该百分比应该尽可能靠近100％。低了则说明有外部碎片。

逻辑扫描碎片（Logical Scan Fragmentation）：无序页的百分比。该百分比应该在0％到10％之间，高则有外部碎片。
扩展盘区扫描碎片（Extent Scan Fragmentation）：无序扩展盘区在扫描索引叶级页中所占的百分比。该百分比应该是0％，高了则说明有外部碎片。
每页上的平均可用字节数（Avg. Bytes Free per Page）：所扫描的页上的平均可用字节数。越高说明有内部碎片，不过在你用这个数字决定是否有内部碎片之前，应该考虑fill factor（填充因子）。
平均页密度（完整）Avg. Page Density (full)：每页上的平均可用字节数的百分比的相反数。低百分比说明有内部碎片。

检查DBCC SHOWCONTIG运行后的结果时，需要特别留意逻辑扫描碎片(Logical Scan Fragmentation)和平均页密度(Average Page Density)。Logic scan fragmentattion表示索引上乱序的百分比(注意: 该数值和堆和文本索引不相关。所谓堆表示一个没有聚集索引的表。)

    碎片会影响I/O。因此要集中关注较大的索引，这些索引被SQL Server放入缓存的可能性较小。通过DBCC SHOWCONTIG得到的页数，可以估算出索引的大小(每页大小为8KB)。一般来说，没有必要关注那些碎片级别小于1,000页的索引。在测试中，包含超过10,000页的索引才会影响性能，特别是包含更多的页(超过50,000页)的索引，会引起最大的性能提升。
     逻辑扫描碎片(logical scan fragmentation)值太高，会大大降低索引扫描的性能。在测试中，那些逻辑碎片大于10%的聚集索引，在碎片整理后性能得到了提升；对那些大于20%的聚集索引，性能提升尤其明显。因此关注那些逻辑碎片大于等于20%的索引。注意，对于堆(Index ID=0)来说，该标准是无意义的。
     平均页密度(average page density)太低，将导致查询中需要读取更多的页。重新组织这些页，可以提高平均页密度，从而完成相同的查询只要读取较少的页。一般来说，在第一次载入数据后，表拥有较高的页密度。随着数据的插入，页密度会降低，从而带来叶级页拆分。检查平均页密度时，记住该值依赖于创建表时设置的填充因子取值。
     扫描密度(scan density)虽然可以作为碎片级别的参考，不过当索引跨越多个文件时，该参考无效。因此，当检查跨越多个文件的索引时，扫描密度不应该被考虑。

SQL Server 2005: sys.dm_db_index_physical_stats 在SQL Server 2005中，已经不建议使用DBCC SHOWCONTIG了，建议使用新的动态管理函数sys.dm_db_index_physical_stats返回能确定索引碎片级别的信息。语法如下：sys.dm_db_index_physical_stats (
    { database_id | NULL | 0 | DEFAULT }
  , { object_id | NULL | 0 | DEFAULT }
  , { index_id | NULL | 0 | -1 | DEFAULT }
  , { partition_number | NULL | 0 | DEFAULT }
  , { mode | NULL | DEFAULT } )

表28-5　sys.dm_db_index_physical_stats的参数

参　　数	描　　述
database_id \| NULL	要检测索引的数据库ID。如果为NULL，返回SQL Server实例中的所有数据库的信息
object_id \| NULL	要检测的表和视图（索引视图）的对象ID。如果为NULL，返回所有表的信息
index_id \| NULL \| 0	要检测的指定索引ID。如果为NULL，返回表中所有索引的信息
partition_number \| NULL	要检测的分区的指定分区编号。如果为NULL，返回基于已定义数据库／表／选择的索引的所有分区的信息
LIMITED \| SAMPLED \| DETAILED \| NULL \| DEFAULT	这些模式影响了如何收集碎片数据。LIMITED模式扫描堆所有的页，但对于索引，则只扫描叶级上面的父级别页。SAMPLED收集在堆或索引中1%采样率的数据。DETAILED模式扫描所有页（堆或索引）。DETAILED是执行最慢的，但也是最精确的选项。指定NULL或DEFAULT的效果与LIMITED模式的相同

获取数据库'adventureworks'的索引碎片：

select db_name(a.database_id)[db_name], --根据数据库ID得到数据库名称
c.name[table_name],                            --表名称
b.name [index_name],                           --索引名称
a.avg_fragmentation_in_percent             --碎片程度
from sys.dm_db_index_physical_stats(db_id('adventureworks'),Null,Null,Null,'limited') a
join sys.indexes b
on a.object_id=b.object_id and a.index_id=b.index_id
join sys.tables c
on c.object_id=a.object_id
where a.index_id>0
and avg_fragmentation_in_percent>20 --碎片程度大于20

结果：

db_name table_name index_name avg_fragmentation_in_percent

AdventureWorks ProductProductPhoto PK_ProductID_ProductPhotoID 50
AdventureWorks StoreContact AK_StoreContact_rowguid 66.66

AdventureWorks StoreContact IX_StoreContact_ContactID 50
AdventureWorks StoreContact IX_StoreContact_ContactTypeID 50

查询返回了数据库AdventureWorks中对象的碎片大于20%的几行。列avg_fragmentation_in_ percent显示聚集索引或非聚集索引的逻辑碎片，返回索引的叶级无序页的百分比。对于堆来说，avg_fragmentation_in_percent显示区级碎片。

（二）、索引维护的方法有哪些?

注：维护方式的选择，一方面要考虑是否是联机维护，另一方面就是速度上的考虑。一般碎片<=30%时，使用重新组织的方法速度比索引重建快;碎片>30%时，索引重建的速度比重新组织要快。决定是否用 DBCC DBREINDEX 还是 DBCC INDEXDEFRAG 取决于你的需求以及硬件环境。 DBCC DBREINDEX会带来更新统计(updating statistics)的副作用，而DBCC INDEXDEFRAG不会。可以通过在执行DBCC INDEXDEFRAG后执行UPDATE STATISTICS来增加其影响。

1. 联机维护

SQL Server2000:

DBCC INDEXDEFRAG 重新组织索引，占用资源少，锁定资源周期短，可联机进行。

DBCC INDEXDEFRAG (
    { database_name | database_id | 0 }
    , { table_name | table_id | view_name | view_id }
    [ , { index_name | index_id } [ , { partition_number | 0 } ] ] )
    [ WITH NO_INFOMSGS ]

SQL Server 2005:

1. 联机重新组织：

ALTER INDEX [index_name] ON [table_name]

REORGANIZE;

2. 联机重建：

ALTER INDEX [index_name] ON [table_name]

REBUILD WITH (FILLFACTOR = 85, SORT_IN_TEMPDB = OFF,

STATISTICS_NORECOMPUTE = ON,ONLINE = ON);

2. 脱机维护

SQL Server2000：DBCC DBREINDEX

DBCC DBREINDEX用于在指定的表上重建一个或多个索引。DBCC DBREINDEX是离线操作方式。当该操作运行时，涉及到的表就无法被用户访问。DBCC DBREINDEX动态地重建索引。没有必要知道参与重建的表结构到底如何，是否用主键或者唯一性约束等信息；重建的时候会自动管理的。DBCC DBREINDEX完全重建索引，也就是说，将页密度级别恢复到最初(默认)的填充因子水平；当然你也可以选择页密度的新值。从内部运行看，DBCC DBREINDEX和手工用T-SQL语句来运行删除然后重新创建索引十分相似。

1：DBCC DBREINDEX('表名',pk_索引名,100)

重做第一步，如发现扫描密度/Scan Density还是小于100%则重构表的所有索引，并不一定能达100%。

第二步:重构SQL Server数据库表所有索引

2：DBCC DBREINDEX('表名',’’,100)

表5 DBCC DBREINDEX 和 DBCC INDEXDEFRAG的比较

（详细见Microsoft SQL Server 2000 索引碎片整理最佳实践）
Functionality	DBCC DBREINDEX	DBCC INDEXDEFRAG
Online/Offline	Offline	Online
Faster when logical fragmentation is:	High	Low
Parallel processing	Yes	No
Compacts pages	Yes	Yes
Can be stopped and restarted without losing work completed to that point	No	Yes
Able to untangle interleaved indexes	May reduce interleaving	No
Additional free space is required in the data file for defragmenting	Yes	No
Faster on larger indexes	Yes	No
Rebuilds statistics	Yes	No
Log space usage	High in full recovery mode (logs entire contents of the index), low in bulk logged or simple recovery mode (only logs allocation of space)	Varies based on the amount of work performed
May skip pages on busy systems	No	Yes

SQL Server 2005：ALTER INDEX [indexname] ON [table_name] REBUILD;

CREATE INDEX WITH DROP_EXISTING

3.比较通用的维护过程，实现自动化维护

a) 获取及查看所有索引的碎片情况

[c-sharp] view plain copy

/*
描述：获取服务器上所有数据库的逻辑碎片率>5的索引信息
适用：SqlServer2000以后版本
*/
SET NOCOUNT ON
DECLARE @db_name varchar(128)
DECLARE @tablename varchar(128)
DECLARE @table_schema varchar(128)
DECLARE @execstr varchar(255)
DECLARE @objectid int
DECLARE @indexid int
DECLARE @frag decimal
DECLARE @sql varchar(8000)
DECLARE @maxfrag decimal
-- 设置索引碎片百分比
SELECT @maxfrag = 5
-- 创建表
if not exists(select 1 from sys.tables where name = 'dba_manage_index_defrag')
create table dba_manage_index_defrag
([db_name] varchar(255) default ''
,[table_name] varchar(255)
,[index_name] varchar(255)
,avg_fragmentation_in_percent real
,write_time datetime default getdate()
)
--创建表
if not exists(select 1 from dbo.sysobjects where name = 'dba_manage_index_defrag_temp')
CREATE TABLE dba_manage_index_defrag_temp (
[db_name] varchar(255)default '',
ObjectName varchar(255),
ObjectId int,
IndexName varchar(255),
IndexId int,
Lvl int,
CountPages int,
CountRows int,
MinRecSize int,
MaxRecSize int,
AvgRecSize int,
ForRecCount int,
Extents int,
ExtentSwitches int,
AvgFreeBytes int,
AvgPageDensity int,
ScanDensity decimal,
BestCount int,
ActualCount int,
LogicalFrag decimal,
ExtentFrag decimal)
--声明一个游标获取当前数据库
DECLARE databases CURSOR FOR
select name
from master.dbo.sysdatabases
where dbid>4
--打开游标，将DBCC获取得到的所有信息插入到临时表
open databases
fetch databases into @db_name
while (@@fetch_status=0)
begin
insert into dba_manage_index_defrag_temp
(
ObjectName ,
ObjectId ,
IndexName,
IndexId ,
Lvl ,
CountPages ,
CountRows ,
MinRecSize ,
MaxRecSize ,
AvgRecSize ,
ForRecCount ,
Extents ,
ExtentSwitches ,
AvgFreeBytes ,
AvgPageDensity ,
ScanDensity ,
BestCount ,
ActualCount ,
LogicalFrag ,
ExtentFrag )
exec('use ['+@db_name+'];
dbcc showcontig
with FAST,TABLERESULTS,ALL_INDEXES,NO_INFOMSGS')
--更新数据库名字
update dba_manage_index_defrag_temp
set [db_name] = @db_name
where [db_name] = ''
fetch next from databases into @db_name
end
--关必游标
close databases
deallocate databases
--插入到正式表中
insert into dba_manage_index_defrag
([db_name]
,[table_name]
,[index_name]
,avg_fragmentation_in_percent
)
select
[db_name],
ObjectName [table_name],
indexname [index_name],
LogicalFrag [avg_fragmentation_in_percent]
from dba_manage_index_defrag_temp
where logicalfrag>5
--删除临时表.
DROP TABLE dba_manage_index_defrag_temp
GO
SELECT * FROM dba_manage_index_defrag
--查看结果

b) 根据(a)查询结果得到的索引碎片的情况自动选择合适的处理方法

针对Sql Server2000的联机维护：

[c-sharp] view plain copy

/*Perform a 'USE <database name>' to select the database in which to run the script.*/
-- Declare variables
SET NOCOUNT ON;
DECLARE @tablename varchar(128);
DECLARE @execstr varchar(255);
DECLARE @objectid int;
DECLARE @indexid int;
DECLARE @frag decimal;
DECLARE @maxfrag decimal;
-- 碎片大小根据临界值采取不同的维护措施.
SELECT @maxfrag = 30.0;
-- 声明一个游标
DECLARE tables CURSOR FOR
SELECT TABLE_SCHEMA+'.'+TABLE_NAME --MSDN上面直接使用TABLE_NAME，如果SCHEMA不是DBO就会出错
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE';
-- 创建临时表.
CREATE TABLE #fraglist (
ObjectName char(255),
ObjectId int,
IndexName char(255),
IndexId int,
Lvl int,
CountPages int,
CountRows int,
MinRecSize int,
MaxRecSize int,
AvgRecSize int,
ForRecCount int,
Extents int,
ExtentSwitches int,
AvgFreeBytes int,
AvgPageDensity int,
ScanDensity decimal,
BestCount int,
ActualCount int,
LogicalFrag decimal,
ExtentFrag decimal);
-- 打开游标
OPEN tables;
-- 循环读取表
FETCH NEXT
FROM tables
INTO @tablename;
WHILE @@FETCH_STATUS = 0
BEGIN
--取出每张表的所有索引
INSERT INTO #fraglist
EXEC ('DBCC SHOWCONTIG (''' + @tablename + ''')
WITH FAST, TABLERESULTS, ALL_INDEXES, NO_INFOMSGS');
FETCH NEXT
FROM tables
INTO @tablename;
END;
-- 关闭游标
CLOSE tables;
DEALLOCATE tables;
--对每张表的索引碎片>5的执行对应索引碎片整理
DECLARE indexes CURSOR FOR
SELECT ObjectName, ObjectId, IndexId, LogicalFrag
FROM #fraglist
WHERE LogicalFrag >= @maxfrag
AND INDEXPROPERTY (ObjectId, IndexName, 'IndexDepth') > 0;
-- Open the cursor.
OPEN indexes;
-- Loop through the indexes.
FETCH NEXT
FROM indexes
INTO @tablename, @objectid, @indexid, @frag;
WHILE @@FETCH_STATUS = 0
BEGIN
PRINT 'Executing DBCC INDEXDEFRAG (0, ' + RTRIM(@tablename) + ',
' + RTRIM(@indexid) + ') - fragmentation currently '
+ RTRIM(CONVERT(varchar(15),@frag)) + '%';
SELECT @execstr = 'DBCC INDEXDEFRAG (0, ' + RTRIM(@objectid) + ',
' + RTRIM(@indexid) + ')';
EXEC (@execstr);
FETCH NEXT
FROM indexes
INTO @tablename, @objectid, @indexid, @frag;
END;
-- Close and deallocate the cursor.
CLOSE indexes;
DEALLOCATE indexes;
-- Delete the temporary table.
DROP TABLE #fraglist;
GO

针对SQL Server 2000的脱机维护：

sp_msforeachtable @command1="dbcc dbreindex('?','',85)"

SQL Server 2005的通用维护过程：(碎片小于30%的联机组织，碎片>=30%的脱机重建)

找出索引碎片程度大于 10% 的索引，然后根据索引碎片程度，分别来采取不同的方法来整理索引碎片。小于 30% 的使用 alter index reorganize；大于等于 30% 的使用 alter index rebuild。其中 reorganize 相当于 dbcc indexdefrag()；rebuild 相当于 dbcc dbreindex()。

SQL 碎片整理后，索引数据页在数据库文件中排列的更紧凑，可以大幅提高一些 SQL 查询的效率。DBA 可以每周进行一次碎片整理。另外要注意的是，不要在收缩数据库（dbcc shrinkfile, dbcc shrinkdatabase）前整理索引碎片。

[c-sharp] view plain copy

--------------------------------------------------------------------------------
-- ensure a USE statement has been executed first.
--------------------------------------------------------------------------------
set nocount on
declare @objectid int
,@indexid int
,@partitioncount bigint
,@schemaname sysname
,@objectname sysname
,@indexname sysname
,@partitionnum bigint
,@partitions bigint
,@frag float
,@command varchar(1000)
select objectid = object_id
,indexid = index_id
,partitionnum = partition_number
,frag = avg_fragmentation_in_percent
into #work_to_do
from sys.dm_db_index_physical_stats(db_id(), null, null , null, 'LIMITED')
where avg_fragmentation_in_percent > 10.0
and index_id > 0
-- declare the cursor for the list of partitions to be processed.
declare partitions cursor for
select * from #work_to_do
-- Open the cursor.
open partitions
-- Loop through the partitions.
fetch next from partitions into @objectid, @indexid, @partitionnum, @frag
while @@fetch_status = 0 begin
select @objectname = o.name, @schemaname = s.name
from sys.objects as o
inner join sys.schemas as s
on s.schema_id = o.schema_id
where o.object_id = @objectid
select @indexname = name
from sys.indexes
where object_id = @objectid
and index_id = @indexid
select @partitioncount = count (*)
from sys.partitions
where object_id = @objectid
and index_id = @indexid
-- 30 is an arbitrary decision point at which to switch between reorganizing and rebuilding
if @frag < 30.0 begin
select @command = 'alter index ' + @indexname + ' on '
+ @schemaname + '.' + @objectname + ' reorganize'
if @partitioncount > 1
select @command = @command + ' partition=' + convert(char, @partitionnum)
end
if @frag >= 30.0 begin
select @command = 'alter index ' + @indexname +' on '
+ @schemaname + '.' + @objectname + ' rebuild'
if @partitioncount > 1
select @command = @command + ' partition=' + convert(char, @partitionnum)
end
-- exec (@command)
print 'Executed: ' + @command
fetch next from partitions into @objectid, @indexid, @partitionnum, @frag
end
-- free resource
close partitions
deallocate partitions
drop table #work_to_do