Oracle® Database Backup and Recovery Advanced User's Guide 10g Release 2 (10.2) Part Number B14191-02
There are several ways to terminate an RMAN command in the middle of execution:
The preferred method is to press CTRL+C (or the equivalent "attention" key combination for your system) in the RMAN interface. This also terminates allocated channels, unless they are hung in the media management code, as happens, for example, when they are waiting for a tape to be mounted.
You can kill the server session corresponding to the RMAN channel by running the SQL ALTER SYSTEM KILL SESSION statement.
You can terminate the server session corresponding to the RMAN channel on the operating system.
You can identify the Oracle session ID for an RMAN channel by looking in the RMAN log for messages with the format shown in the following example:
channel ch1: sid=15 devtype=SBT_TAPE
The sid and devtype are displayed for each allocated channel. Note that the Oracle sid is different from the operating system process ID. You can kill the session using a SQL ALTER SYSTEM KILL SESSION statement.
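When an RMAN log contains many channels, it can help to collect the sid values programmatically. The following is a minimal Python sketch, not part of RMAN, that assumes the log messages follow the format shown in the example above. The function name find_channel_sids and the second line of the sample log are hypothetical.

```python
import re

# Matches channel messages of the form shown in this section, for example:
#   channel ch1: sid=15 devtype=SBT_TAPE
CHANNEL_RE = re.compile(
    r"^channel\s+(?P<name>\S+):\s+sid=(?P<sid>\d+)\s+devtype=(?P<devtype>\S+)\s*$"
)


def find_channel_sids(log_text):
    """Return a list of (channel_name, sid, devtype) tuples found in log_text."""
    channels = []
    for line in log_text.splitlines():
        match = CHANNEL_RE.match(line.strip())
        if match:
            channels.append(
                (match.group("name"), int(match.group("sid")), match.group("devtype"))
            )
    return channels


if __name__ == "__main__":
    # The first line is the example from this section; the second is a
    # hypothetical line added only to show that several channels are handled.
    sample_log = (
        "channel ch1: sid=15 devtype=SBT_TAPE\n"
        "channel ch2: sid=17 devtype=DISK\n"
    )
    for name, sid, devtype in find_channel_sids(sample_log):
        print(name, sid, devtype)
```

Lines that do not match the channel message format are ignored, so the whole RMAN log can be passed in unchanged.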
ALTER SYSTEM KILL SESSION takes two arguments, the sid printed in the RMAN message and a serial number, both of which can be obtained by querying V$SESSION. For example, run the following statement, where sid_in_rman_output is the number from the RMAN message:
SELECT SERIAL# FROM V$SESSION WHERE SID=sid_in_rman_output;
Then, run the following statement, substituting the sid_in_rman_output and serial number obtained from the query:
ALTER SYSTEM KILL SESSION 'sid_in_rman_output,serial#';
Note that this statement does not release the session if the session is hung in media manager code.
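If you script this step, the statement text can be assembled from the two values before it is run in SQL*Plus. The following Python sketch only builds the statement string and does not connect to a database. The function name build_kill_session_sql and the serial number 4711 are hypothetical and used for illustration only.

```python
def build_kill_session_sql(sid, serial):
    """Return the ALTER SYSTEM KILL SESSION statement for a sid and serial#."""
    # int() raises ValueError for non-numeric strings, so arbitrary text
    # cannot end up inside the quoted 'sid,serial#' argument.
    sid = int(sid)
    serial = int(serial)
    if sid < 0 or serial < 0:
        raise ValueError("sid and serial# must be non-negative")
    return "ALTER SYSTEM KILL SESSION '{},{}';".format(sid, serial)


if __name__ == "__main__":
    # sid 15 comes from the RMAN message shown earlier; 4711 stands in for
    # the SERIAL# value returned by the V$SESSION query.
    print(build_kill_session_sql(15, 4711))
```

Converting both values to integers first is a deliberate choice in this sketch, because the argument is a quoted string and should contain nothing but the two numbers.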
Finding and killing the processes that are associated with the server sessions is operating system specific. On some platforms the server sessions are not associated with any processes at all. Refer to your operating system specific documentation for more information.
You may sometimes need to kill an RMAN job that is hung in the media manager. The best way to terminate RMAN when the channel connections are hung in the media manager is to kill the session in the media manager. If this action does not solve the problem, then on some platforms, such as UNIX, you may be able to kill the Oracle processes of the connections. (Note that killing the Oracle processes may cause problems with the media manager. See your media manager documentation for details.)
The nature of an RMAN session depends on the operating system. In UNIX, an RMAN session has the following processes associated with it:
The RMAN client process itself
The default channel, the initial connection to the target database
One target connection to the target database corresponding to each allocated channel
The catalog connection to the recovery catalog database, if you use a recovery catalog
An auxiliary connection to an auxiliary instance, during DUPLICATE or TSPITR operations
A polling connection to the target database, used for monitoring RMAN command execution on the various allocated channels. By default, RMAN makes one polling connection. RMAN makes additional polling connections if you use different connect strings in the ALLOCATE CHANNEL or CONFIGURE CHANNEL commands. One polling connection exists for each distinct connect string used in the ALLOCATE CHANNEL or CONFIGURE CHANNEL command.
RMAN usually hangs because one of the channel connections is waiting in the media manager code for a tape resource. The catalog connection and the default channel appear to hang, because they are waiting for RMAN to tell them what to do. Polling connections seem to be in an infinite loop while polling the RPC under the control of the RMAN process.
If you kill the RMAN process itself, then you also kill the catalog connection, the auxiliary connection, the default channel, and the polling connections. If target and auxiliary connections are not hung in the media manager code, they also terminate. If either the target connection or any of the auxiliary connections are executing in the media management layer, they will not terminate until the processes are manually killed at the operating system level.
Not all media managers can detect the termination of the Oracle process. Those which cannot may keep resources busy or continue processing. Consult your media manager documentation for details.
Terminating the catalog connection does not cause the RMAN process to terminate because RMAN is not performing catalog operations while the backup or restore is in progress. Removing the default channel and polling connections causes the RMAN process to detect that one of the channels has died and then proceed to exit. In this case, the connections to the hung channels remain active as described previously.
Once the channels hung in the media manager code are killed, the RMAN process detects this termination and proceeds to exit, removing all connections except target connections that are still operative in the media management layer. The warning about the media manager resources still applies in this case.
To terminate an Oracle process that is hung in the media manager:
Query V$SESSION and V$SESSION_WAIT as described in "Monitoring RMAN Through V$ Views". For example, execute the following query:
COLUMN EVENT FORMAT a10
COLUMN SECONDS_IN_WAIT FORMAT 999
COLUMN STATE FORMAT a20
COLUMN CLIENT_INFO FORMAT a30

SELECT p.SPID, sw.EVENT, sw.SECONDS_IN_WAIT AS SEC_WAIT,
       sw.STATE, s.CLIENT_INFO
FROM   V$SESSION_WAIT sw, V$SESSION s, V$PROCESS p
WHERE  sw.EVENT LIKE 'sbt%'
AND    s.SID=sw.SID
AND    s.PADDR=p.ADDR
;
Examine the SQL output to determine which sbt functions are waiting. For example, the output may be as follows:
SPID EVENT      SEC_WAIT   STATE                CLIENT_INFO
---- ---------- ---------- -------------------- -------------
8642 sbtwrite2         600 WAITING              rman channel=ORA_SBT_TAPE_1
8374 sbtwrite2         600 WAITING              rman channel=ORA_SBT_TAPE_2
Using operating system-level tools appropriate to your platform, kill the hung sessions. For example, on Solaris execute a kill -9 command:
% kill -9 8642 8374
On Windows, there is a command-line utility called ORAKILL which lets you kill a specific thread in this situation. From a command prompt, run the following command:
orakill sid thread_id
where sid identifies the database instance to target, and the thread_id is the SPID value from the query in step 1.
Check that the media manager also clears its processes. If any remain, the next backup or restore operation may hang again, due to the previous hang. In some media managers, the only solution is to shut down and restart the media manager. If the documentation from the media manager does not provide the needed information, contact technical support for the media manager.
See Also:
Your operating system specific documentation for the relevant commands
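The procedure above for finding the hung processes can be partly scripted. The following Python sketch, which is an illustration and not an Oracle-supplied tool, assumes the query output has been saved as text with SPID in the first column and EVENT in the second, as in the sample output in step 2. It extracts the SPIDs of sessions waiting in sbt functions and builds the argument list for the UNIX kill -9 command without running it. The function names hung_sbt_spids and kill_command are hypothetical.

```python
# Sample rows copied from the example output in this section.
SAMPLE_OUTPUT = """\
SPID EVENT      SEC_WAIT   STATE                CLIENT_INFO
---- ---------- ---------- -------------------- -------------
8642 sbtwrite2         600 WAITING              rman channel=ORA_SBT_TAPE_1
8374 sbtwrite2         600 WAITING              rman channel=ORA_SBT_TAPE_2
"""


def hung_sbt_spids(query_output):
    """Return the SPIDs of rows whose EVENT column starts with 'sbt'."""
    spids = []
    for line in query_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric SPID followed by the wait event;
        # the heading and separator rows fail the isdigit() check.
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].startswith("sbt"):
            spids.append(fields[0])
    return spids


def kill_command(spids):
    """Build (but do not run) the UNIX kill -9 command for the given SPIDs."""
    return ["kill", "-9"] + list(spids)


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(" ".join(kill_command(hung_sbt_spids(SAMPLE_OUTPUT))))
```

The sketch prints the command rather than executing it, so that the SPIDs can be checked against the media manager state before any process is killed.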