Oracle9i Application Server Administrator's Guide Release 2 (9.0.2) Part Number A92171-02
This chapter provides information about high availability within Oracle9i Application Server.
The availability of a system or any component in that system is defined by the percentage of time that it works normally. A system works normally when it meets its correctness and performance specifications. For example, a system that works normally for twelve hours per day is 50% available. A system that has 99% availability is down 3.65 days per year on average. System administrators can expect critical systems to have 99.99% or even 99.999% availability. At 99.99% availability, a system is down about 53 minutes per year; at 99.999%, it is down only about five minutes per year.
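The downtime figures above follow directly from the definition of availability. The following Python sketch shows the arithmetic; the function name and the choice of a 365-day year are assumptions made for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the expected downtime per year, in minutes."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (100.0 - availability_percent) / 100.0


for pct in (50.0, 99.0, 99.99, 99.999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% available: {minutes:.1f} minutes "
          f"({minutes / 1440:.2f} days) of downtime per year")
```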
Availability may not be constant over time. For example, availability may be higher during the daytime when most transactions occur, and lower during the night and on weekends. However, because the Internet provides a global set of users, it is a common requirement that systems are always available.
Redundant components can improve availability, but only if a spare component takes over immediately for a failed component. If it takes ten minutes to detect a component failure and twenty additional minutes to start the spare component, then the system experiences a 50% reduction in availability for that hour of service.
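The effect of a slow takeover can be checked with a short calculation. This Python sketch is illustrative only; the function and parameter names are hypothetical.

```python
def availability_in_window(detect_minutes: float,
                           restart_minutes: float,
                           window_minutes: float = 60.0) -> float:
    """Return percent availability for a window containing one failure.

    Downtime is the time to detect the failure plus the time to start
    the spare component, capped at the length of the window.
    """
    downtime = min(detect_minutes + restart_minutes, window_minutes)
    return 100.0 * (window_minutes - downtime) / window_minutes


# Ten minutes to detect plus twenty minutes to start the spare:
print(availability_in_window(10, 20))
```

With ten minutes of detection and twenty minutes of startup, the hour contains thirty minutes of downtime, which matches the 50% figure in the text.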
Oracle9iAS is designed to provide maximum system availability during many types of hardware and software failures.
Oracle9i Application Server keeps your system available by providing the following benefits:
At the mod_oc4j-OC4J level, after the failure of a stateless service, the application server routes requests to alternate instances of the service in a similar fashion to connection rerouting. After a failure of a stateful service, the application server reroutes the request to an alternate instance to which the state has been replicated.
At application server instance level, you can use a hardware load balancer, Oracle9iAS Web Cache, or a DNS round robin strategy to load balance requests across redundant application server instances. This allows your Web site to continue functioning even if one of the application server instances goes down.
See Also:
For instructions on configuring Oracle9iAS Web Cache as a load balancer to route requests, refer to Oracle9iAS Web Cache Administration and Deployment Guide.
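The idea of balancing requests across redundant application server instances, and skipping an instance that has gone down, can be sketched in a few lines of Python. This is a conceptual illustration of round-robin selection only, not how a hardware load balancer, Oracle9iAS Web Cache, or DNS is implemented; the class and instance names are hypothetical.

```python
class RoundRobinBalancer:
    """Hand out instances in turn, skipping any marked as down."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._down = set()
        self._next = 0

    def mark_down(self, name):
        self._down.add(name)

    def mark_up(self, name):
        self._down.discard(name)

    def pick(self):
        # Try each instance at most once, starting from the current position.
        for _ in range(len(self._instances)):
            candidate = self._instances[self._next]
            self._next = (self._next + 1) % len(self._instances)
            if candidate not in self._down:
                return candidate
        raise RuntimeError("no application server instance is available")


balancer = RoundRobinBalancer(["as1", "as2", "as3"])
balancer.mark_down("as2")
print([balancer.pick() for _ in range(4)])
```

Requests continue to be served by the remaining instances while one instance is down, which is the behavior the text describes.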
Oracle9iAS Web Cache supports the following content-aware load balancing and failover detection features:
An auto-restart process that automatically restarts the cache.
For instance, certain viruses or attacks attempt to bring down sites by flooding servers with malicious requests, some of which are intended to exploit weaknesses in the application Web server. Because attacks are typically distributed, it is often difficult to block all the IPs that are sending the requests. However, by caching the responses to the initial requests, Oracle9iAS Web Cache can shield the backend servers from having to respond to subsequent requests of this nature. In other words, caching provides some level of denial-of-service protection.
Figure 13-1 shows an example of high availability architecture where an Oracle9iAS instance can contain one Oracle HTTP Server instance and zero or more OC4J processes. Here, Oracle9iAS Web Cache is used as a load balancer to route requests. OPMN within the Oracle9iAS instance environment monitors the Oracle HTTP Server and OC4J processes. If either Oracle HTTP Server or OC4J fails, OPMN restarts the failed process. This enables a maximum level of availability of the components within an Oracle9iAS instance.
The mod_oc4j module routes requests from the Oracle HTTP Server to Oracle Containers for J2EE (OC4J). It works in conjunction with OPMN to keep its routing table updated so that it load balances across only live OC4J processes in an OC4J instance. When a request comes in, mod_oc4j selects a process from its shared memory table. If a session is created with this request, related requests are routed to the same process. Otherwise, a new process is selected based on the load balancing algorithm. To allow fault tolerance, this module supports OC4J instances and islands. To handle failover with stateless requests, it tries to select an alternative OC4J process within the same island and routes the request to that process. If all the OC4J processes in the instance have been tried and none of them is successful, mod_oc4j returns a request failure. For session-enabled requests, the module tries to fail over to another OC4J process within the same island as the original process that serviced the request. Again, if all the OC4J processes in the island are unable to service the request, mod_oc4j returns a request failure.
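The routing rules described above can be modeled with a small Python sketch. It is a simplified illustration, not the mod_oc4j implementation: the class, the process names, and the use of round-robin as the load balancing algorithm are assumptions, and stateless requests are simply balanced across every live process in the instance.

```python
class OC4JRouter:
    """Toy model of session-aware routing across islands of processes."""

    def __init__(self, islands):
        # islands maps an island ID to the list of its process names.
        self.islands = {name: list(procs) for name, procs in islands.items()}
        self.live = {p for procs in self.islands.values() for p in procs}
        self._counter = 0

    def mark_dead(self, process):
        self.live.discard(process)

    def _island_of(self, process):
        for name, procs in self.islands.items():
            if process in procs:
                return name
        raise KeyError(process)

    def route_stateless(self):
        """Pick any live process in the instance, or None on request failure."""
        candidates = [p for procs in self.islands.values()
                      for p in procs if p in self.live]
        if not candidates:
            return None
        choice = candidates[self._counter % len(candidates)]
        self._counter += 1
        return choice

    def route_session(self, original):
        """Stay on the original process; fail over only within its island."""
        if original in self.live:
            return original
        for process in self.islands[self._island_of(original)]:
            if process in self.live:
                return process
        return None  # every process in the island is down: request failure


router = OC4JRouter({"island1": ["p1", "p2"], "island2": ["p3"]})
router.mark_dead("p1")
print(router.route_session("p1"))  # fails over within island1
```

Note that a session request never leaves its island in this model, because session state is shared only among the processes of one island.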
Oracle Process Management and Notification Service (OPMN) manages the processes within an Oracle9iAS instance and channels all notifications from the different component instances to all components interested in receiving them.
OPMN consists of the following two components:
Oracle Process Manager (PM) is the centralized process management mechanism in Oracle9iAS and is used to manage all Oracle HTTP Server and OC4J related processes. It starts, stops, restarts, and detects the death of these processes. The startup characteristics of each set of processes are specified in a configuration file called opmn.xml. The Oracle Enterprise Manager Web site also uses PM to manage processes.
The PM starts and then waits for a command to start specific or all processes. Similarly, at shutdown, the PM receives a request to stop one or more processes, or all processes and itself as specified by the HTTP request parameters.
Oracle Notification System (ONS) is the transport mechanism for failure, recovery, startup, and other related notifications between components in Oracle9iAS. It operates according to a subscriber-publisher model, wherein any component that wishes to receive a notification of a certain type subscribes to the ONS. When such a notification is published, ONS sends it to all subscribers. PM and ONS exist within the same process, and together comprise Oracle Process Management and Notification. OPMN is monitored by a shadow process that restarts it upon request or after a catastrophic failure.
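The subscriber-publisher model that ONS follows can be illustrated with a minimal in-process Python sketch. It shows the general pattern only; the class, method names, and notification types are hypothetical and do not represent the ONS interface.

```python
from collections import defaultdict


class NotificationServer:
    """Deliver each published notification to subscribers of its type."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, notification_type, callback):
        self._subscribers[notification_type].append(callback)

    def publish(self, notification_type, payload):
        """Send payload to every subscriber; return how many received it."""
        callbacks = list(self._subscribers.get(notification_type, []))
        for callback in callbacks:
            callback(payload)
        return len(callbacks)


ons = NotificationServer()
received = []
ons.subscribe("failure", received.append)
ons.publish("failure", "OC4J process down")
ons.publish("startup", "HTTP Server started")  # nobody subscribed to this type
print(received)
```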
The Oracle Process Manager uses a configuration file, opmn.xml, that contains information about all the processes it manages. This information includes parameters to start a process as well as metadata for bookkeeping. Each process type that OPMN manages has a corresponding element defined in opmn.xml. This section discusses the following elements:
This is the topmost element for the process manager. The administrative process defines a single process manager. A process-manager element contains elements of types:
ohs, oc4j, custom: Specify managed processes.
log-file: Specifies the location of the process manager log file. It is optional. The default is ORACLE_HOME/opmn/logs/ipm.log.
The log-file element can have one attribute: level (optional). The level attribute can be one of 1=FATAL, 2=ERROR, 3=WARN, 4=NOTIFY, 5=DEBUG, 6=VERBOSE. The default is 3 (warning, error, and fatal messages are logged).
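The default of 3 implies that a message is logged when its level number is less than or equal to the configured level. The following Python sketch spells this out; it assumes that the same cumulative rule applies to every level, which the text states explicitly only for the default.

```python
LEVELS = {1: "FATAL", 2: "ERROR", 3: "WARN", 4: "NOTIFY", 5: "DEBUG", 6: "VERBOSE"}


def logged_levels(configured_level: int = 3):
    """Return the message levels written at the given log-file level."""
    if configured_level not in LEVELS:
        raise ValueError("level must be between 1 and 6")
    return [name for number, name in sorted(LEVELS.items())
            if number <= configured_level]


print(logged_levels())   # default level 3
print(logged_levels(5))
```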
This element defines an Oracle HTTP Server process and has the following attributes:
gid: A unique identifier for the process. It is optional. The default is "HTTP Server".
maxRetry: Specifies the number of times PM should check the Oracle HTTP Server process before declaring it dead, and the number of times to try to start an Oracle HTTP Server process before giving up. It is optional. The default value is 3.
timeout: Specifies the maximum time needed to wait for Oracle HTTP Server to completely start. It is optional. The default value is dependent on system load, but is at least 300 seconds.
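The role of maxRetry when starting a process can be pictured as a bounded retry loop. This Python sketch is a conceptual illustration only and does not reflect the internals of PM; the function name and the start_once callable are hypothetical.

```python
def start_with_retries(start_once, max_retry: int = 3):
    """Try to start a process up to max_retry times.

    start_once is a callable that returns True when the process started.
    Returns the number of the successful attempt, or None after giving up.
    """
    for attempt in range(1, max_retry + 1):
        if start_once():
            return attempt
    return None


# A start command that succeeds on its third attempt:
outcomes = iter([False, False, True])
print(start_with_retries(lambda: next(outcomes)))
```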
The ohs element can contain optional elements of the following types:
config-file: Specifies the server configuration file. It is optional. The default value is ORACLE_HOME/Apache/Apache/conf/httpd.conf.
The config-file element can have these attributes:
path: The path to the configuration file. If you specify a relative path, then the current Oracle HTTP Server working directory is prepended to the path you specify.
start-mode: Specifies the mode in which the server starts: ssl or non-ssl. It is optional. The default is non-ssl.
The start-mode element can have the mode attribute. Specify ssl to start Oracle HTTP Server in SSL mode.
This element defines an OC4J process group and has the following attributes:
gid: Unique identifier for the OC4J process group. It is optional. The default is home.
numProcs: Number of OC4J processes to start in this OC4J process group. It is optional. This value is ignored if the island element is specified for this OC4J process group. The default is 1.
maxRetry: Number of times PM checks OC4J before declaring it dead, and the number of times to try to start it before giving up. It is optional. The default is 3.
instanceName: Defines a group of OC4J processes, or islands, that service the same set of applications. It is optional. The default is home.
timeout: Specifies the maximum time needed to wait for OC4J to completely start. It is optional. The default value is dependent on system load, but is at least 300 seconds.
The oc4j element can contain elements of the following types:
config-file: Typically, specifies the server.xml file if it is in a non-default location. It is required. If a relative path is specified, the current OC4J working directory is prepended to the path.
java-bin: Specifies the path to the Java executable that starts OC4J. It is optional. The default is ORACLE_HOME/jdk/bin/java.
java-option: Specifies JVM options required by the OC4J command line. It is optional.
oc4j-option: Specifies OC4J options required by the OC4J command line. It is optional.
port: Specifies the individual port or range of ports on which OC4J listens. It is optional if numProcs has not been specified or is equal to 1, and required if numProcs is greater than 1.
The port element requires the following attributes:
ajp: AJP ports on which the OC4J process listens. If the value is set to 0, then OC4J picks the AJP port.
jms: JMS ports on which the OC4J process listens.
rmi: RMI ports on which the OC4J process listens.
For example:
<port ajp="8000,8010-8020,8003" jms="7030-7050,7780-7790" rmi="9000,9001,9002,9003,9004" />
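A port attribute value such as the one above mixes individual ports and ranges. The following Python sketch expands such a value into a list of port numbers. It illustrates the syntax only; whether OPMN preserves the listed order or removes duplicates is not stated here, so the sketch simply keeps the order as written.

```python
def expand_ports(spec: str):
    """Expand a value like "8000,8010-8020,8003" into a list of ports."""
    ports = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = (int(value) for value in part.split("-", 1))
            if low > high:
                raise ValueError(f"invalid port range: {part}")
            ports.extend(range(low, high + 1))
        else:
            ports.append(int(part))
    return ports


print(expand_ports("8000,8010-8020,8003"))
print(len(expand_ports("7030-7050,7780-7790")))
```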
island: A set of JVMs sharing session state within an OC4J process. For example, if a session is being handled by OC4J process 1 in island 1 and process 1 dies, mod_oc4j fails the session over to OC4J process 2, which is also a member of island 1.
To start multiple islands within an OC4J process, you must specify an island element for each island. In this case the numProcs attribute, if specified, for OC4J is ignored. If you want to start only one island within an OC4J process, you can specify numProcs either as an OC4J attribute or as an island attribute.
The island element requires the following attributes:
id: Unique identifier for the island.
numProcs: Number of processes to start for this island and OC4J process.
For example:
<island id="island1" numProcs="2"/>
<island id="island2" numProcs="3"/>
The environment element can have the following element type:
prop: Represents one environment name value pair for the custom process.
The prop element can have the following attributes:
This section contains examples of OC4J configurations.
<oc4j instanceName="oc4jInstance1">
  <config-file path="/oracle/j2ee/home/config/server.xml" />
  <port ajp="6666-6670" rmi="6686-6690" jms="6786-6790"/>
  <island id="island" numProcs="4"/>
</oc4j>
or
<oc4j instanceName="oc4jInstance1" numProcs="4">
  <config-file path="/oracle/j2ee/home/config/server.xml" />
  <port ajp="6666-6670" rmi="6686-6690" jms="6786-6790"/>
</oc4j>
These two configurations achieve the same results except that the second one uses default_island as its OC4J island ID.
<oc4j instanceName="oc4jInstance1">
  <config-file path="/oracle/j2ee/home/config/server.xml" />
  <port ajp="6666-6670" rmi="6686-6690" jms="6786-6790"/>
  <island id="island1" numProcs="2"/>
  <island id="island2" numProcs="2"/>
</oc4j>
or
<oc4j instanceName="oc4jInstance1" numProcs="4">
  <config-file path="/oracle/j2ee/home/config/server.xml" />
  <port ajp="6666-6670" rmi="6686-6690" jms="6786-6790"/>
  <island id="island1" numProcs="2"/>
  <island id="island2" numProcs="2"/>
</oc4j>
These two configurations achieve the same result. Since the island element is specified, the OC4J attribute numProcs is ignored.
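The rule for combining numProcs and island elements can be expressed as a short Python function that reads an oc4j element and reports the islands that would result. This is a sketch of the rule as described in this section, not of how OPMN parses opmn.xml; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def effective_islands(oc4j_xml: str):
    """Return a mapping of island ID to process count for one oc4j element."""
    element = ET.fromstring(oc4j_xml)
    islands = element.findall("island")
    if islands:
        # When island elements are present, the oc4j numProcs attribute
        # is ignored and each island supplies its own count.
        return {island.get("id"): int(island.get("numProcs", "1"))
                for island in islands}
    # With no island element, the processes run in default_island.
    return {"default_island": int(element.get("numProcs", "1"))}


with_islands = """
<oc4j instanceName="oc4jInstance1" numProcs="4">
  <island id="island1" numProcs="2"/>
  <island id="island2" numProcs="2"/>
</oc4j>
"""
print(effective_islands(with_islands))
print(effective_islands('<oc4j instanceName="oc4jInstance1" numProcs="4"/>'))
```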
This element defines a generic process group and has the following attributes:
gid: Unique identifier for this custom process group. It is optional. The default is default_gid1.
num_of_proc: Number of processes to start in this group. It is optional. The default is 1.
maxRetry: Specifies the number of times to try to start the process before giving up. It is optional. The default is 3.
The custom element can contain elements of type:
start: Specifies the absolute path to the command used to start the process. It is required.
environment: Specifies the environment required by the process when it is spawned.
The environment element can have one of the following element types:
prop: Represents one environment name value pair for the custom process.
The prop element can have the following attributes:
This section provides some sample configurations for the Oracle Process Manager. Electronic copies of these are available at http://otn.oracle.com
This configuration starts one Oracle HTTP Server and one OC4J process with default values.
<notification-server>
  <port local="6001" remote="6002"> </port>
  <log-file path="/private/my_directory/tmp/opmn_logs/ons.log" level="5"> </log-file>
</notification-server>
<process-manager>
  <ohs/>
  <oc4j>
    <config-file path="ORACLE_HOME/j2ee/home/config/server.xml" />
  </oc4j>
  <log-file path="/private/my_directory/tmp/opmn_logs/ipm.log" level="4"> </log-file>
</process-manager>
This configuration starts one Oracle HTTP Server process, two OC4J processes, and one generic process with several user-specified values.
<process-manager>
  <ohs gid="a1" maxRetry="3">
    <config-file path="/my_directory/conf/httpd.conf"/>
  </ohs>
  <oc4j gid="o1" numProcs="2" maxRetry="4" clusterID="myClusterA" islandID="myIslandA">
    <config-file path="/my_directory/conf/oc4j.xml"/>
    <port http="8401" rmi="8600" jms="6577" ajp="3456"/>
  </oc4j>
  <custom gid="g1" num_of_proc="1">
    <start path="/my_directory/bin/exec1"/>
    <environment>
      <prop name="PATH" value="/my_directory/ias/lib"/>
      <prop name="CLASSPATH" value="/my_directory/ias/bin"/>
    </environment>
  </custom>
</process-manager>
This configuration starts one Oracle HTTP Server and two OC4J processes with some user-specified values.
<notification-server>
  <port local="6001" remote="6002"> </port>
  <log-file path="/private/my_directory/tmp/opmn_logs/ons.log" level="5"> </log-file>
</notification-server>
<process-manager>
  <!-- Start one ohs process with a process group ID of a1, with a
       config file in a non-default directory location -->
  <ohs gid="a1" maxRetry="3">
    <config-file path="Apache/Apache/my_conf/httpd.conf"/>
  </ohs>
  <!-- Start two oc4j processes with a process group ID of o1, with an
       instanceName of myInstanceA (This should be the worker specified
       in the mod_oc4j config file), and an islandID of myIslandA.
       Since we are starting two processes we have to specify the port
       range used when starting the processes. So OC4J process1 starts
       with ajp port 8010, jms port 8020 and rmi port 8030. The second
       OC4J process uses ajp port 8011, jms port 8021, and rmi port 8031. -->
  <oc4j gid="o1" numProcs="2" instanceName="myInstanceA" islandID="myIslandA">
    <port ajp="8010-8012" jms="8020-8022" rmi="8030-8032"/>
  </oc4j>
  <!-- If the logs should be created in a specific directory provide
       that location here -->
  <log-file path="/tmp/ipm.log" level="4"/>
</process-manager>
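The comment in the preceding configuration describes how ports are taken from each range: the first process uses the first port of every range, the second process the next port, and so on. The following Python sketch reproduces that assignment for contiguous ranges. The function name is hypothetical, and the sequential assignment is inferred from the comment rather than from a description of OPMN internals.

```python
def assign_ports(num_procs: int, ajp, jms, rmi):
    """Assign one port per protocol to each process from (low, high) ranges."""
    ranges = {"ajp": ajp, "jms": jms, "rmi": rmi}
    assignments = []
    for index in range(num_procs):
        assignment = {}
        for protocol, (low, high) in ranges.items():
            port = low + index
            if port > high:
                raise ValueError(f"{protocol} range is too small for {num_procs} processes")
            assignment[protocol] = port
        assignments.append(assignment)
    return assignments


for number, ports in enumerate(assign_ports(2, (8010, 8012), (8020, 8022), (8030, 8032)), 1):
    print(f"OC4J process {number}: {ports}")
```

A range that is smaller than the number of processes raises an error in this sketch, which mirrors the requirement that a port range be specified when numProcs is greater than 1.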
The Oracle Notification System (ONS) uses a configuration file, opmn.xml. This section discusses the following elements in opmn.xml, under the "notification-server" section:
This specifies the ports to which ONS binds and listens for notifications or the start, stop, restart, and statistical dump requests.
local: Bound to "localhost", this port is required and is used by all ONS clients on this host (including Oracle HTTP Server and OC4J).
remote: Bound to the current host, this optional port is used for communication with other OPMN servers.
request: Bound to the current host, this optional port is used for statistical dump requests only. This port is required by the DCM layer, which is the layer above OPMN.
This specifies SSL information for the ONS remote port. It is optional.
enabled: Set to "true" if SSL is enabled, otherwise "false". It is required.
wallet-file: Specifies the path of the wallet file to use. It is optional.
wallet-password: Specifies the password for the given wallet-file. It is required if wallet-file is specified.
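The attribute rules for the ssl element can be checked mechanically. The following Python sketch applies the two rules stated above to a single ssl element; the function name and the message wording are hypothetical, and the check is not part of any Oracle tool.

```python
import xml.etree.ElementTree as ET


def ssl_problems(ssl_xml: str):
    """Return a list of rule violations for one ssl element."""
    element = ET.fromstring(ssl_xml)
    problems = []
    if element.get("enabled") not in ("true", "false"):
        problems.append('enabled is required and must be "true" or "false"')
    has_wallet = element.get("wallet-file") is not None
    if has_wallet and element.get("wallet-password") is None:
        problems.append("wallet-password is required when wallet-file is specified")
    return problems


print(ssl_problems('<ssl enabled="true" wallet-file="/ora/wal" wallet-password="foo"/>'))
print(ssl_problems('<ssl wallet-file="/ora/wal"/>'))
```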
This specifies the location of the notification manager log file. It is optional. The default is ORACLE_HOME/opmn/logs/ons.log.
The log-file element can have one attribute: level. It is optional. The level attribute can be one of 1=FATAL, 2=ERROR, 3=WARN, 4=NOTIFY, 5=DEBUG, 6=VERBOSE. The default is 3 (warning, error, and fatal messages are logged).
This section provides some sample configurations for the Oracle Notification System.
<notification-server>
  <port local="6100" request="6300"/>
  <log-file path="/ora/opmn/logs/ons.log" level="4"/>
</notification-server>

<notification-server>
  <port local="6100" remote="6200" request="6300"/>
  <ssl enabled="true" wallet-file="/ora/wal" wallet-password="foo"/>
  <log-file path="/ora/opmn/logs/ons.log" level="4"/>
</notification-server>
This section discusses the common failure scenarios and explains the response from the high availability infrastructure in Oracle9i Application Server.
Figure 13-2 depicts a scenario where Oracle HTTP Server goes down.
Figure 13-3 depicts a scenario where OC4J goes down.
For non-J2EE applications, such as PL/SQL or Perl, there is no impact at all.
Since OPMN does not play a role in actively servicing requests, this is not a critical failure. However, since it is critical in ensuring system-wide updates, its failure recovery is important to ensuring a healthy steady state.
Figure 13-4 depicts a scenario where OPMN goes down.
The routing table used by mod_oc4j might get stale while OPMN is down, resulting in less optimal routing.
Copyright © 2002 Oracle Corporation. All Rights Reserved.