TSM configuration for rman restores
Judging from the Oracle metalink and from several posts on the Internet configuring TSM for rman restores is a confusing and dangerous subject. We will not deal here with rman backups with tsm in depth, because they are quite straightforward.
In this example a working method, used tenths of times, is explained. Our purpose is to restore a database DBPROD, normally running on the node nodeprod, onto a server nodecopy. There is much confusing documentation about this subject; sometimes it is completely wrong and dangerous. The method here is not general, but safe.
The names that will be used in our example are:
| Original node |
Node for the copy |
Database |
tsm file |
| nodeprod |
nodecopy |
DBPROD |
tdpo.dbprod.opt |
Pragmatic approach to the rman restore
Murphy's theorem "If something can go wrong, it will" applies without exceptions to the tsm/rman restores. It is threfore better to have three or four scripts at hand.
- Optimistic script for the restore and recover
- Script for restoring a controlfile
- Script for restoring a datafile
- Script for recovering the database
Normally the first script will not run to the end for any reason (no space left, connection hanging etc.) leaving the work almost done. At that point you shouldn't start again from the beginning, but try recovering the database, probably getting an error message about a datafile to be restored etc. Restore the datafile an try again the recovery.
Preparing the environment on nodecopy
Supposing the on nodeprod you follow the OFA structure, the environment for DBPROD should be found under $ORACLE_BASE/admin/DBPROD Take a tar of this tree, copy it onto nodecopy and reproduce the environment.
- tar cvfp /tmp/DBPROD.tar $ORACLE_BASE/admin/DBPROD/*
- ftp nodecopy
bin
put DBPROD.tar
On nodecopy, untar
- tar xvfp /tmp/DBPROD.tar
- Copy the tsm/tdp configuration file /opt/tivoli/tsm/client/oracle/bin/tdpo.dbprod.opt onto the second server as it is. This is an important point: the tsm configuration file must be the same, even if there is another configuration file for the tsm client on the second node.
In contrary to several examples on the internet, our script does not use any auxiliary instance and, especially, doesn't connect to any target database. Connecting to a target database (which is the original one) with tsm causes many dangerous, because the result is that tsm will try to restore the database onto the server nodeprod, possibly overwriting the original database!
Before trying the restore, make sure you are under the same UNIX account as the one that took the backup! Is is of fundumental importance, otherwise you will get this kind of error messages:
RMAN-03022: compiling command: IRESTORE
RMAN-03023: executing command: IRESTORE
RMAN-08016: channel ch1: starting datafile backupset restore
RMAN-08502: set_count=3524 set_stamp=527391926 creation_time=01:45:26 29-MAY-2004
RMAN-08089: channel ch1: specifying datafile(s) to restore from backup set
RMAN-08523: restoring datafile 00003 to /u05/oradata/PCHIPBOA/users01PCHIPBOA.dbf
RMAN-08523: restoring datafile 00004 to /u05/oradata/PCHIPBOA/conversion_01_PCHIPBOA.dbf
RMAN-08523: restoring datafile 00005 to /u02/oradata/PCHIPBOA/dlwk_01_PCHIPBOA.dbf
RMAN-03026: error recovery releasing channel resources
RMAN-08031: released channel: ch1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure during compilation of command
RMAN-03013: command type: restore
RMAN-03007: retryable error occurred during execution of command: IRESTORE
RMAN-07004: unhandled exception during command execution on channel ch1
RMAN-10035: exception raised in RPC: ORA-19507: failed to retrieve sequential file, handle="e4fmun5m_1_1", parms=""
ORA-27029: skgfrtrv: sbtrestore returned error
RMAN-10031: ORA-19624 occurred during call to DBMS_BACKUP_RESTORE.RESTOREBACKUPPIECE
RMAN>
1. The first action of the script is setting up the environment, which can be done in several ways. In this example a configuration file ~oracle/bin/DBPROD is sourced in, whose contents can be:
export ORA_NLS32=/u00/app/oracle/product/8.1.7/ocommon/nls/admin/data
export ORA_NLS=/u00/app/oracle/product/8.1.7/ocommon/nls/admin/data
export ORACLE_BASE=/u00/app/oracle
ORACLE_HOME=$ORACLE_BASE/product/8.1.7
ORACLE_SID=DBPROD
ORACLE_TERM=xsun5
2. The local instance is then started in nomount mode (there is no controlfile available).
3. The script connects to the rman catalog.
#!/bin/ksh
#
# The script restores and recovers an rman backup
#
echo "Setting environment for DBPROD ... "
. ~oracle/bin/DBPROD
echo "Shutting down the new instance (just in case ...)"
sqlplus /nolog<<EOF_SQL_1
connect / as sysdba
shutdown immediate
exit
EOF_SQL_1
echo "\nStarting the local instance in nomount mode ... "
sqlplus /nolog<<EOF_SQL
connect / as sysdba
startup nomount
exit
EOF_SQL
#
echo "Connecting with rman to the repository and locally ... "
rman catalog "rman/rmanpwd@REPOS" target "/" <<EOF_RMAN
run
{
set archivelog destination to '/o27/oradata/DBPROD';
allocate channel ch1 type 'sbt_tape' parms 'ENV=(TDPO_OPTFILE=/opt/tivoli/tsm/client/oracle/bin/tdpo.dbprod.opt)';
restore database;
restore controlfile to '/u22/oradata/DBPROD/ctl1DBPROD.dbf';
replicate controlfile from '/u22/oradata/DBPROD/ctl1DBPROD.dbf';
sql 'alter database mount';
recover database;
}
EOF_RMAN
Here are some of the possible errors you can come across:
- Directory does not exist. Well, create it, or use a softlink.
- No space left on device. Create soft links to another device or use the "set newname", which has the effect of a rename or move.
run
{
allocate channel ch1 type 'sbt_tape' parms 'ENV=(TDPO_OPTFILE=/opt/tivoli/tsm/client/oracle/bin/tdpo.dbprod.opt)';
allocate auxiliary channel ch2 type disk;
set newname for datafile 1 to '/tmp/system01DBPROD.dbf';
set newname for datafile 2 to '/o04/oradata/DBPROD/tbsxx_cwcspidx4_01.dbf';
set newname for datafile 3 to '/o02/oradata/DBPROD/tbsxx_cwcsptab4_01.dbf';
set newname for datafile 4 to '/o01/oradata/DBPROD/tbsxx_tab11.dbf';
set newname for datafile 5 to '/o01/oradata/DBPROD/tbsxx_tab12.dbf';
set newname for datafile 6 to '/o01/oradata/DBPROD/tbsxx_tab13.dbf';
set newname for datafile 7 to '/o01/oradata/DBPROD/tbsxx_tab16.dbf';
set newname for datafile 8 to '/o01/oradata/DBPROD/tbsxx_tab17.dbf';
set newname for datafile 9 to '/o01/oradata/DBPROD/tbsxx_tab18.dbf';
... ... ...
restore database;
release channel ch1;
release channel ch2;
}
ORA-16032: parameter LOG_ARCHIVE_DEST destination string cannot be translated
This means that the definition of LOG_ARCHIVE_DEST in your init.ora references a directory that does not exist on the target server. Please change the destination.
SQL> ORA-16032: parameter LOG_ARCHIVE_DEST destination string cannot be translated
ORA-07286: sksagdi: cannot obtain device information.
SVR4 Error: 2: No such file or directory
SQL> Disconnected
oracle hasn't been linked with the correct tape library libobk.so
RMAN-03022: compiling command: allocate
RMAN-03023: executing command: allocate
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03007: retryable error occurred during execution of command: allocate
RMAN-07004: unhandled exception during command execution on channel ch1
RMAN-10035: exception raised in RPC: ORA-19554: error allocating device, device type: SBT_TAPE, device name:
ORA-19557: device error, device type: SBT_TAPE, device name:
ORA-27000: skgfqsbi: failed to initialize storage subsystem (SBT) layer
Additional information: 4110
ORA-19511: SBT error = 4110, errno = 0, BACKUP_DIR environment variable is not set
RMAN-10031: ORA-19624 occurred during call to DBMS_BACKUP_RESTORE.DEVICEALLOCATE
The solution is creating a symbolik link and relink the executables
cd $ORACLE_HOME/lib
... ln -s /opt/tivoli/tsm/client/oracle/bin32/libobk.so ./libobk.so
... pwd
/u00/app/oracle/product/8.1.7/lib
... cd ../bin
... relink all
RMAN-11001: Oracle Error: ORA-01991: invalid password file
RMAN-03022: compiling command: sql
RMAN-06162: sql statement: alter database mount
RMAN-03023: executing command: sql
RMAN-03026: error recovery releasing channel resources
RMAN-08031: released channel: ch1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03006: non-retryable error occurred during execution of command: sql
RMAN-07004: unhandled exception during command execution on channel default
RMAN-20000: abnormal termination of job step
RMAN-11003: failure during parse/execution of SQL statement: alter database mount
RMAN-11001: Oracle Error: ORA-01991: invalid password file '/u00/app/oracle/product/8.1.7/dbs/orapwDBPROD'
The easiest solution is commenting the parameter remote_password_file=exclusive in initDBPROD.ora
Try the recover after a doubtful restore
rman is more solid and tolerant that normally thought; if the restore/recover wasn't without errors, try to recover the database. The "restore" is the operation that copies back a kind of seed database, in an unconsistent status; the recovery makes it consistent.
#!/bin/ksh
#
# The script recovers a database that has already been restored
#
echo "Setting up the environment for DBPROD ... "
. ~oracle/bin/DBPROD
echo "Shutting down the local instance (just in case ...)"
sqlplus /nolog<<EOF_SQL_1
connect / as sysdba
shutdown immediate
exit
EOF_SQL_1
echo "\nStarting the local instance in nomount mode ... "
sqlplus /nolog<<EOF_SQL
connect / as sysdba
startup nomount
exit
EOF_SQL
echo "Connecting to rman repositorry and to the local instance ... "
rman catalog "rman/rmanpwd@REPOS" target "/" <<EOF_RMAN
set until time "to_date('2004/06/17 12:40:00','YYYY/MM/DD HH24:MI:SS')";
run
{
set archivelog destination to '/o27/oradata/DBPROD';
allocate channel ch1 type 'sbt_tape' parms 'ENV=(TDPO_OPTFILE=/opt/tivoli/tsm/client/oracle/bin/tdpo.dbprod.opt)';
sql 'alter database mount';
recover database;
}
EOF_RMAN
You may get some kind of error such as:
RMAN-03022: compiling command: recover(2)
RMAN-03026: error recovery releasing channel resources
RMAN-08031: released channel: ch1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure during compilation of command
RMAN-03013: command type: recover
RMAN-03002: failure during compilation of command
RMAN-03013: command type: recover(2)
RMAN-06094: datafile 10 must be restored
In that case the complain is clear; restore the datafile 10:
recoverDbFile.ksh
#!/bin/ksh
#
# The script restores a database file
#
echo "Setting up the environment for DBPROD ... "
. ~oracle/bin/DBPROD
echo "Shutting down the local instance (just in case ...)"
sqlplus /nolog<<EOF_SQL_1
connect / as sysdba
shutdown immediate
exit
EOF_SQL_1
echo "\nStarting the local instance in nomount mode ... "
sqlplus /nolog<<EOF_SQL
connect / as sysdba
startup nomount
exit
EOF_SQL
rman catalog "rman/rmanpwd@REPOS" target "/" <<EOF_RMAN
run
{
set archivelog destination to '/o27/oradata/DBPROD';
allocate channel ch1 type 'sbt_tape' parms 'ENV=(TDPO_OPTFILE=/opt/tivoli/tsm/client/oracle/bin/tdpo.dbprod.opt)';
restore datafile 10;
}
EOF_RMAN
At this point you can go for another "recover".
... ...
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure during compilation of command
RMAN-03013: command type: recover
RMAN-03002: failure during compilation of command
RMAN-03013: command type: recover(4)
RMAN-06053: unable to perform media recovery because of missing log
RMAN-06025: no backup of log thread 1 seq 17013 scn 4679540745415 found to restore
ERROR at line 1:
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '/u21/oradata/DBPROD/system_01DBPROD.dbf'
This error means that the not all the redo logs have been applied. Remember that the recovery is always uncomplete, because the last online redo log on the original database is certainly not available.
Use the set until time "to_date('2004/05/03 16:53:00','YYYY/MM/DD HH24:MI:SS')"; clause, specifying some time in the past certainly covered by the available archived redo logs.
... ... ...
RMAN-08022: channel ch1: restoring archivelog
RMAN-08510: archivelog thread=1 sequence=16993
RMAN-08022: channel ch1: restoring archivelog
RMAN-08510: archivelog thread=1 sequence=16994
RMAN-08023: channel ch1: restored backup piece 1
RMAN-08511: piece handle=ijfofmct_1_1 tag=null params=NULL
RMAN-08024: channel ch1: restore complete
RMAN-08515: archivelog filename=/o27/oradata/DBPROD/DBPROD_1_16993.arc
thread=1 sequence=16993
... ... ...
RMAN-08023: channel ch1: restored backup piece 1
RMAN-08511: piece handle=jdfohpuk_1_1 tag=null params=NULL
RMAN-08024: channel ch1: restore complete
RMAN-08515: archivelog filename=/o27/oradata/DBPROD/DBPROD_1_17008.arc thread=1 sequence=17008
RMAN-08515: archivelog filename=/o27/oradata/DBPROD/DBPROD_1_17009.arc thread=1 sequence=17009
RMAN-08018: channel ch1: starting archivelog restore to user-specified destination
RMAN-08508: archivelog destination=/o27/oradata/DBPROD
RMAN-08022: channel ch1: restoring archivelog
RMAN-08510: archivelog thread=1 sequence=17010
RMAN-08022: channel ch1: restoring archivelog
RMAN-08510: archivelog thread=1 sequence=17011
RMAN-08023: channel ch1: restored backup piece 1
RMAN-08511: piece handle=jefohq6b_1_1 tag=null params=NULL
RMAN-08024: channel ch1: restore complete
... ... ...
RMAN-08515: archivelog filename=/o27/oradata/DBPROD/DBPROD_1_17016.arc thread=1 sequence=17016
RMAN-08515: archivelog filename=/o27/oradata/DBPROD/DBPROD_1_17017.arc thread=1 sequence=17017
RMAN-08060: unable to find archivelog
RMAN-08510: archivelog thread=1 sequence=17018
RMAN-03026: error recovery releasing channel resources
RMAN-08031: released channel: ch1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure during compilation of command
RMAN-03013: command type: recover
RMAN-03006: non-retryable error occurred during execution of command: recover(4)
RMAN-07004: unhandled exception during command execution on channel default
RMAN-20000: abnormal termination of job step
RMAN-06054: media recovery requesting unknown log: thread 1 scn 4679551971795
The last error is absolutely normal and it simply means that the online redo log of the source database is not available; every recovery is on point of time and we can open the database with resetlogs.
SQL> startup mount
SQL> ORACLE instance started.
Total System Global Area 550826144 bytes
Fixed Size 73888 bytes
Variable Size 426102784 bytes
Database Buffers 122880000 bytes
Redo Buffers 1769472 bytes
Database mounted.
SQL> alter database open resetlogs;
Database altered.
How to monitor what rman is doing
The view v$session_longops is available, which can be queried when rman is performing long backups or restores:
SQL> descr v$session_longops
Name Null? Type
----------------------------------------- -------- ----------------------------
SID NUMBER
SERIAL# NUMBER
OPNAME VARCHAR2(64)
TARGET VARCHAR2(64)
TARGET_DESC VARCHAR2(32)
SOFAR NUMBER
TOTALWORK NUMBER
UNITS VARCHAR2(32)
START_TIME DATE
LAST_UPDATE_TIME DATE
TIME_REMAINING NUMBER
ELAPSED_SECONDS NUMBER
CONTEXT NUMBER
MESSAGE VARCHAR2(512)
USERNAME VARCHAR2(30)
SQL_ADDRESS RAW(4)
SQL_HASH_VALUE NUMBER
QCSID NUMBER
SQL>
alter session set nls_date_format = 'hh24:mi:ss dd-mon-yyyy';
Session altered.
set line 140
column target format a10
column opname format a32
column sid format 999
SELECT sid, opname, target, start_time, last_update_time,
ROUND(sofar*100/totalwork,1) AS pct, time_remaining AS secs
FROM v$session_longops
ORDER BY start_time;
SID OPNAME TARGET START_TIME LAST_UPDATE_TIME PCT SECS
---------- --------------------------------------------------- -------------------- --------------------
10 RMAN: full datafile restore 3524 14:02:04 18-JUN-2004 14:07:56 18-JUN-2004 100 0
10 RMAN: full datafile restore 3525 14:07:59 18-JUN-2004 14:08:36 18-JUN-2004 10.4 319
SID OPNAME TARGET START_TIME LAST_UPDATE_TIME PCT
11 RMAN: incremental datafile restore 3684 12:13:50 17-jun-2004 12:14:02 17-jun-2004 100
11 RMAN: incremental datafile restore 3685 12:14:24 17-jun-2004 12:15:13 17-jun-2004 100
11 RMAN: incremental datafile restore 3686 12:15:19 17-jun-2004 12:15:59 17-jun-2004 100
11 RMAN: incremental datafile restore 3687 12:16:55 17-jun-2004 12:17:44 17-jun-2004 100
The target 3684, 3685 etc. refer to the set_count which is currently being restored:
RMAN-08016: channel ch1: starting datafile backupset restore
RMAN-08502: set_count=3658 set_stamp=528944412 creation_time=01:00:12 16-jun-2004
RMAN-08089: channel ch1: specifying datafile(s) to restore from backup set
RMAN-08523: restoring datafile 00010 to /u26/oradata/DBPROD/idx_01DBPROD_INDX.dbf
RMAN-08023: channel ch1: restored backup piece 1
RMAN-08511: piece handle=iafoe38s_1_1 tag=null params=NULL
RMAN-08024: channel ch1: restore complete
RMAN-08031: released channel: ch1
RMAN-08509: destination for restore of datafile 00010: /u26/oradata/DBPROD/idx_01DBPROD_INDX.dbf
RMAN-08509: destination for restore of datafile 00028: /u21/oradata/DBPROD/data_05DBPROD_DATA.dbf
RMAN-08023: channel ch1: restored backup piece 1
RMAN-08511: piece handle=j5foh37o_1_1 tag=null params=NULL
RMAN-08024: channel ch1: restore complete
RMAN-08039: channel ch1: starting incremental datafile backupset restore
RMAN-08502: set_count=3686 set_stamp=529043127 creation_time=04:25:27 17-jun-2004
RMAN-08089: channel ch1: specifying datafile(s) to restore from backup set
RMAN-08509: destination for restore of datafile 00022: /u25/oradata/DBPROD/evt_x_1DBPROD.dbf
RMAN-08023: channel ch1: restored backup piece 1
Conclusions
We are confident this method will solve all your problems with the rman/tsm rstores on a different server. We are selling our complete foolproof recovery-restore script for 75$. You will not run the risk of overwriting your original database because the scrip has been used many times on different configurations.
If you have questions and need a consultancy, send an email to support@front-row-seat.com
|
|