一.前言
|
Monday | differential incremental level 1 | backup archivelog |
Tuesday | differential incremental level 1 | backup archivelog |
Wednsday | differential incremental level 1 | backup archivelog |
Thusday | differential incremental level 1 | backup archivelog |
Friday | differential incremental level 1 | backup archivelog |
Saturday | differential incremental level 1 | backup archivelog |
Sumday | cumulative incremental level 1 | backup archivelog |
每月的第一天做上个月的零级全备。
这样在任意时间需要恢复数据是,最多用到一个全备+一个周备+7个日备+一天之内的归档日志。
特别注意的是:如果生产库非常大,那么强烈建议使用block_change_tracking特性,能极大缩短增量备份时间,否则做一次增量备份的时间与做一次全备所需时间无异!
OS: Linux5.8 x64 DB:Oracle 11.2.0.3 x64
主库:test3_A 备库test3_B 且正常运行中
test3_B:
SQL> archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/archivelogs/test3
Oldest online log sequence 45
Next log sequence to archive 0
Current log sequence 59
SQL> alter database recover managed standby database cancel;
Database altered.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>
test3_A:
SQL> archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/archivelog/test3
Oldest online log sequence 57
Next log sequence to archive 59
Current log sequence 59
SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> /
System altered.
SQL> archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/archivelog/test3
Oldest online log sequence 60
Next log sequence to archive 62
Current log sequence 62
SQL> quit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@redhat ~]$ rman target /
Recovery Manager: Release 11.2.0.3.0 - Production on Mon Jun 10 21:31:04 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
connected to target database: TEST3 (DBID=915537947)
RMAN> list archivelog low sequence 60 high sequence 62;
using target database control file instead of recovery catalog
List of Archived Log Copies for database with db_unique_name TEST3_A
=====================================================================
Key Thrd Seq S Low Time
------- ---- ------- - ---------
468 1 60 A 10-JUN-13
Name: /u01/archivelog/test3/1_60_800115694.dbf
469 1 61 A 10-JUN-13
Name: /u01/archivelog/test3/1_61_800115694.dbf
RMAN> delete archivelog low sequence 60 high sequence 61; (相比10g DG的改进是,没有被applied的归档,是不能仅仅被delete命令删除的,必须使用delete force删除)
released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=55 device type=DISK
RMAN-08120: WARNING: archived log not deleted, not yet applied by standby
archived log file name=/u01/archivelog/test3/1_60_800115694.dbf thread=1 sequence=60
RMAN-08120: WARNING: archived log not deleted, not yet applied by standby
archived log file name=/u01/archivelog/test3/1_61_800115694.dbf thread=1 sequence=61
RMAN> delete force archivelog low sequence 60 high sequence 61;
released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=55 device type=DISK
List of Archived Log Copies for database with db_unique_name TEST3_A
=====================================================================
Key Thrd Seq S Low Time
------- ---- ------- - ---------
468 1 60 A 10-JUN-13
Name: /u01/archivelog/test3/1_60_800115694.dbf
469 1 61 A 10-JUN-13
Name: /u01/archivelog/test3/1_61_800115694.dbf
Do you really want to delete the above objects (enter YES or NO)?
yes
deleted archived log
archived log file name=/u01/archivelog/test3/1_60_800115694.dbf RECID=468 STAMP=817766882
deleted archived log
archived log file name=/u01/archivelog/test3/1_61_800115694.dbf RECID=469 STAMP=817766885
Deleted 2 objects
RMAN> quit
Recovery Manager complete.
[oracle@redhat ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Mon Jun 10 21:48:31 2013
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
test3_B:
SQL> startup
ORACLE instance started.
Total System Global Area 622149632 bytes
Fixed Size 2230912 bytes
Variable Size 180356480 bytes
Database Buffers 436207616 bytes
Redo Buffers 3354624 bytes
Database mounted.
Database opened.
SQL> alter database recover managed standby database using current logfile disconnect from session;
Database altered.
此时警告日志文件中出现:
Mon Jun 10 21:48:27 2013
FAL[client]: Failed to request gap sequence
GAP - thread 1 sequence 60-61
DBID 915537947 branch 800115694
FAL[client]: All defined FAL servers have been attempted.
------------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
parameter is defined to a value that's sufficiently large
enough to maintain adequate log switch information to resolve
archivelog gaps.
------------------------------------------------------------
Mon Jun 10 21:48:40 2013
RFS[2]: Selected log 5 for thread 1 sequence 64 dbid 915537947 branch 800115694
Mon Jun 10 21:48:40 2013
Archived Log entry 76 added for thread 1 sequence 63 ID 0x37014efb dest 1:
RFS[2]: Selected log 4 for thread 1 sequence 65 dbid 915537947 branch 800115694
Mon Jun 10 21:48:44 2013
Archived Log entry 77 added for thread 1 sequence 64 ID 0x37014efb dest 1:
然后查看一下日志gap的范围,发现正好是在primary强行删掉的两个归档日志:
SQL> select * from v$archive_gap;
THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
---------- ------------- --------------
1 60 61
归档日志的gap,具体来说就是从某个scn开始,备库没有可用的归档日志进行apply。但是尽管如此,后来的归档日志仍然会自动传输到standby端,但是却不能被应用,也就不能被rman的备份策略自动删除(force除外,但是我相信生产系统几乎不会用到它),进而导致空间占满问题;相应的primary端的日志同样不能被正常删除,也会遇到空间问题,然后可能hang住。这对于磁盘空间本不富裕,且日志产生频繁的系统就是一颗定时炸弹!
那么先来查看一下standby (test3_B)端截至目前的scn:
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
3221814
那么意味着,我们必须至少将standby的scn继续向后推动到primary在gap之后的最早的一个日志(可能是在线日志,也可能是归档日志)的first_change#,才能继续应用日志! 因此接下来需要在primary端做一次 基于SCN起始的增量备份
在test3_A做基于SCN的增量备份,该备份包含了从3221814起,截至数据文件当前时间点,所有变化的数据块。如果此时primary没有全备也没关系。但是如果有block_change_tracking的话,能显著缩短这一过程的时间!
注意下面会同时生成一份最新的standby controlfile的备份集。这是一个自定义跨度的增量备份,因此differential 或者cumulative或者level都不需要指定了!
test3_A:
RMAN>run
{}
注:或者加上tag能够方便定位具体备份集
run
[oracle@redhat ~]$ cd /u01/rman/test3/
[oracle@redhat test3]$ ls -l
total 34564
-rw-r----- 1 oracle dba 11600384 Jun 10 22:17 block_tracking.log
-rw-r----- 1 oracle dba 10420224 Nov 23 2012 c-915537947-20121123-00
-rw-r----- 1 oracle dba 11051008 Jun 10 22:16 db_25obsb96_20130610.bk
-rw-r----- 1 oracle dba 1130496 Jun 10 22:16 db_26obsb9l_20130610.bk
-rw-r----- 1 oracle dba 1130496 Jun 10 22:16 db_27obsb9p_20130610.bk
[oracle@redhat test3]$ scp *20130610.bk oracle@10.0.0.21:/u01/rman/test3/
oracle@10.0.0.21's password:
db_25obsb96_20130610.bk 100% 11MB 10.5MB/s 00:01
db_26obsb9l_20130610.bk 100% 1104KB 1.1MB/s 00:00
db_27obsb9p_20130610.bk 100% 1104KB 1.1MB/s 00:00
[oracle@redhat test3]$
最后确定主库在此scn后是否添加过数据文件,如果有则需要在standby端也手动添加:
SQL> select FILE#,name from v$datafile where CREATION_CHANGE#> =3221814;
no rows selected
SQL> alter database recover managed standby database cancel;
Database altered.
RMAN> shutdown immediate
using target database control file instead of recovery catalog
database closed
database dismounted
Oracle instance shut down
RMAN> startup mount
connected to target database (not started)
Oracle instance started
database mounted
Total System Global Area 622149632 bytes
Fixed Size 2230912 bytes
Variable Size 180356480 bytes
Database Buffers 436207616 bytes
Redo Buffers 3354624 bytes
RMAN> catalog start with '/u01/rman/test3';
searching for all files that match the pattern /u01/rman/test3
List of Files Unknown to the Database
=====================================
File Name: /u01/rman/test3/db_25obsb96_20130610.bk
File Name: /u01/rman/test3/db_27obsb9p_20130610.bk
File Name: /u01/rman/test3/db_36nm49ec_20120925.bk
File Name: /u01/rman/test3/c-915537947-20120925-00
File Name: /u01/rman/test3/c-915537947-20121030-01
File Name: /u01/rman/test3/c-915537947-20121106-00
File Name: /u01/rman/test3/db_37nm49go_20120925.bk
File Name: /u01/rman/test3/c-915537947-20121030-02
File Name: /u01/rman/test3/db_26obsb9l_20130610.bk
File Name: /u01/rman/test3/arc_35nm49e8_20120925.bk
File Name: /u01/rman/test3/arc_38nm49gr_20120925.bk
Do you really want to catalog the above files (enter YES or NO)? yes
cataloging files...
cataloging done
List of Cataloged Files
=======================
File Name: /u01/rman/test3/db_25obsb96_20130610.bk
File Name: /u01/rman/test3/db_27obsb9p_20130610.bk
File Name: /u01/rman/test3/db_36nm49ec_20120925.bk
File Name: /u01/rman/test3/c-915537947-20120925-00
File Name: /u01/rman/test3/c-915537947-20121030-01
File Name: /u01/rman/test3/c-915537947-20121106-00
File Name: /u01/rman/test3/db_37nm49go_20120925.bk
File Name: /u01/rman/test3/c-915537947-20121030-02
File Name: /u01/rman/test3/db_26obsb9l_20130610.bk
File Name: /u01/rman/test3/arc_35nm49e8_20120925.bk
File Name: /u01/rman/test3/arc_38nm49gr_20120925.bk
RMAN> recover database noredo;
Starting recover at 10-JUN-13
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=26 device type=DISK
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /u01/oradata/test3/system01.dbf
destination for restore of datafile 00002: /u01/oradata/test3/undotbs01.dbf
destination for restore of datafile 00003: /u01/oradata/test3/sysaux01.dbf
destination for restore of datafile 00004: /u01/oradata/test3/users01.dbf
destination for restore of datafile 00005: /u01/oradata/test3/example01.dbf
channel ORA_DISK_1: reading from backup piece /u01/rman/test3/db_25obsb96_20130610.bk
channel ORA_DISK_1: piece handle=/u01/rman/test3/db_25obsb96_20130610.bk tag=TAG20130610T221605
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:23
Finished recover at 10-JUN-13
RMAN> quit
Recovery Manager complete.
test3_B:
[oracle@redhat1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Mon Jun 10 22:38:51 2013
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
3221814
test3_A:
SQL> select thread#,sequence#,first_change#,next_change# from v$archived_log where first_change#<3221814 and next_change#>3221814;
THREAD# SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ---------- ------------- ------------
1 59 3221114 3221815
1 59 3221114 3221815
test3_B:
RMAN> restore standby controlfile from '/u01/rman/test3/db_27obsb9p_20130610.bk'; (一定要恢复standby控制文件,否则standby DB不能跳过原有gap的信息)
Starting restore at 10-JUN-13
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=19 device type=DISK
channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07
output file name=/u02/test3/control01.ctl
output file name=/u03/test3/control02.ctl
Finished restore at 10-JUN-13
RMAN> quit
Recovery Manager complete.
[oracle@redhat1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Mon Jun 10 23:13:10 2013
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> alter database mount;
Database altered.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-10458: standby database requires recovery
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '/u01/oradata/test3/system01.dbf'
SQL> recover standby database using backup controlfile;
ORA-00279: change 3230822 generated at 06/10/2013 22:16:06 needed for thread 1
ORA-00289: suggestion : /u01/archivelogs/test3/1_66_800115694.dbf
ORA-00280: change 3230822 for thread 1 is in sequence #66
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00279: change 3232949 generated at 06/10/2013 22:41:24 needed for thread 1
ORA-00289: suggestion : /u01/archivelogs/test3/1_67_800115694.dbf
ORA-00280: change 3232949 for thread 1 is in sequence #67
ORA-00278: log file '/u01/archivelogs/test3/1_66_800115694.dbf' no longer
needed for this recovery
ORA-00308: cannot open archived log '/u01/archivelogs/test3/1_67_800115694.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
SQL> alter database open;
Database altered.
SQL> alter database recover managed standby database using current logfile disconnect from session;
Database altered.
此时standby的警告日志信息,发现DG已经可以越过gap,开始从67号日志开始apply:
Mon Jun 10 23:17:22 2013
Primary database is in MAXIMUM PERFORMANCE mode
Changing standby controlfile to MAXIMUM PERFORMANCE mode
RFS[1]: Assigned to RFS process 26846
RFS[1]: Selected log 4 for thread 1 sequence 68 dbid 915537947 branch 800115694
Mon Jun 10 23:17:25 2013
RFS[2]: Assigned to RFS process 26850
RFS[2]: Selected log 5 for thread 1 sequence 67 dbid 915537947 branch 800115694
Mon Jun 10 23:17:28 2013
Archived Log entry 1 added for thread 1 sequence 67 ID 0x37014efb dest 1:
Mon Jun 10 23:18:59 2013
alter database recover managed standby database using current logfile disconnect from session
Attempt to start background Managed Standby Recovery process (test3)
Mon Jun 10 23:18:59 2013
MRP0 started with pid=27, OS id=26857
MRP0: Background Managed Standby Recovery process started (test3)
Serial Media Recovery started
Managed Standby Recovery starting Real Time Apply
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Log /u01/archivelogs/test3/1_67_800115694.dbf
Completed: alter database recover managed standby database using current logfile disconnect from session
Media Recovery Waiting for thread 1 sequence 68 (in transit)
Recovery of Online Redo Log: Thread 1 Group 4 Seq 68 Reading mem 0
Mem# 0: /u02/test3/standby_redo04a.log
Mem# 1: /u03/test3/standby_redo04b.log
至此恢复成功!
1. 如果使用手动压缩全备,并使用dulipcate target database for standy database的重建方式,即使有增量备份传输到standby,也不会被自动应用。
2. 如果先恢复standby controlfile, 就不用做catalog start with,接下来就能做recover database了。
3. 这是一个真实的案例!这个问题在生产中遇到时,才会发现制约因素很多,远没有本机测试环境那么顺利。经验的积累很重要!
|
来自: 浸心阁 > 《rman & dataguard》