Linux FailSafe Administrator's Guide
(document number: 007-4322-002 / published: 2001-02-28)
A resource is a single
physical or logical entity that provides a service to clients or other resources.
A resource is generally available for use on two or more nodes in a cluster,
although only one node controls the resource at any given time. For example,
a resource can be a single disk volume, a particular network address, or an
application such as a web server. Resources are identified by a resource name
and a resource type. A resource name identifies a specific
instance of a resource type. A resource type is a particular
class of resource. All of the resources in a given resource type can be handled
in the same way for the purposes of failover. Every resource is an instance
of exactly one resource type. A resource type is identified with a simple name. A resource type can
be defined for a specific logical node, or it can be defined for an entire
cluster. A resource type that is defined for a node will override a clusterwide
resource type definition of the same name; this allows an individual node
to override global settings from a clusterwide resource type definition. The Linux FailSafe software includes many predefined resource types.
If these types fit the application you want to make into a highly available
service, you can reuse them. If none fit, you can define additional resource
types. To define a resource, you provide the following information:
- The name of the resource to define, with a maximum length of 255 characters.
- The type of resource to define. The Linux FailSafe system contains some predefined resource types (template and IP_address). You can define your own resource type as well.
- The name of the cluster that contains the resource.
- The logical name of the node that contains the resource (optional). If you specify a node, a local version of the resource will be defined on that node.
- Resource type-specific attributes for the resource. Each resource type may require specific parameters to define for the resource, as described in the following subsections.
You can define up to 100 resources in a Linux FailSafe configuration. The IP Address resources
are the IP addresses used by clients to access the highly available services
within the resource group. These IP addresses are moved from one node to another
along with the other resources in the resource group when a failure is detected. You specify the resource name of an IP address in dotted decimal notation.
IP names that require name resolution should not be used. For example, 192.26.50.1
is a valid resource name of the IP Address resource type. The IP address you define as a Linux FailSafe resource must not be the
same as the IP address of a node hostname or the IP address of a node's control
network. When you define an IP address, you can optionally specify the following
parameters. If you specify any of these parameters, you must specify all of
them.
- The broadcast address for the IP address.
- The network mask of the IP address.
- A comma-separated list of interfaces on which the IP address can be configured. This ordered list is a superset of all the interfaces on all nodes where this IP address might be allocated. Hence, in a mixed cluster with different ethernet drivers, an IP address might be placed on eth0 on one system and ln0 on another. In this case the interfaces field would be eth0,ln0 or ln0,eth0.
The order of the list of interfaces determines the priority order for
determining which IP address will be used for local restarts of the node.
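The naming and parameter rules above can be checked before a definition is entered. The following Python fragment is only an illustrative sketch and is not part of Linux FailSafe; the function name validate_ip_resource and its arguments are hypothetical. It checks that the resource name is in dotted decimal notation and that the three optional parameters are given either all together or not at all.

```python
import ipaddress


def validate_ip_resource(name, broadcast=None, netmask=None, interfaces=None):
    """Return a list of problems with a proposed IP address resource.

    Hypothetical helper for illustration only; it does not talk to
    Linux FailSafe in any way.
    """
    problems = []

    # The resource name must be a dotted-decimal address, not a host name
    # that would need name resolution.
    parts = name.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        problems.append("resource name must be in dotted decimal notation")
    else:
        try:
            ipaddress.IPv4Address(name)
        except ipaddress.AddressValueError:
            problems.append("resource name is not a valid IPv4 address")

    # The optional parameters are all-or-nothing.
    optional = {
        "broadcast": broadcast,
        "netmask": netmask,
        "interfaces": interfaces,
    }
    given = [key for key, value in optional.items() if value]
    if given and len(given) != len(optional):
        missing = sorted(set(optional) - set(given))
        problems.append("also specify: " + ", ".join(missing))

    return problems
```

An empty list means that the definition passes these two checks; it says nothing about whether the address collides with a node hostname or control network address, which must still be verified separately.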
One resource can be dependent on one or more other
resources; if so, it will not be able to start (that is, be made available
for use) unless the dependent resources are started as well. Dependent resources
must be part of the same resource group. Like resources, a resource type can be dependent on one or more other
resource types. If such a dependency exists, at least one instance of each
of the dependent resource types must be defined. For example, a resource type
named Netscape_web might have resource type dependencies
on resource types named IP_address and volume. If a resource named ws1 is defined with the Netscape_web resource type, then the resource group containing ws1 must also contain at least one resource of the type IP_address and one resource of the type volume. You cannot make resources mutually dependent. For example, if resource
A is dependent on resource B, then you cannot make resource B dependent on
resource A. In addition, you cannot define cyclic dependencies. For example,
if resource A is dependent on resource B, and resource B is dependent on resource
C, then resource C cannot be dependent on resource A. When you add a dependency to a resource definition, you provide the
following information:
- The name of the existing resource to which you are adding a dependency.
- The resource type of the existing resource to which you are adding a dependency.
- The name of the cluster that contains the resource.
- Optionally, the logical node name of the node in the cluster that contains the resource. If specified, resource dependencies are added to the node's definition of the resource. If this is not specified, resource dependencies are added to the cluster-wide resource definition.
- The resource name of the resource dependency.
- The resource type of the resource dependency.
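The rule against mutual and cyclic dependencies amounts to a reachability test on the dependency graph. The following Python fragment is a sketch of that idea under the assumption that dependencies are held in a simple dictionary; the function name creates_cycle is hypothetical, and this is not how Linux FailSafe itself is implemented.

```python
def creates_cycle(dependencies, resource, new_dependency):
    """Return True if making `resource` depend on `new_dependency`
    would create a mutual or cyclic dependency.

    `dependencies` maps each resource name to the collection of
    resource names it already depends on.
    """
    # Walk everything reachable from the proposed dependency. If the
    # walk gets back to `resource`, the new edge would close a loop.
    stack = [new_dependency]
    seen = set()
    while stack:
        current = stack.pop()
        if current == resource:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(dependencies.get(current, ()))
    return False
```

With A dependent on B and B dependent on C, the call creates_cycle({"A": {"B"}, "B": {"C"}}, "C", "A") reports that making C dependent on A is not allowed.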
To define a resource with the Cluster Manager GUI, perform the following
steps: Launch the FailSafe Manager. On the left side of the display, click on the “Resources
& Resource Types” category. On the right side of the display click on the “Define
a New Resource” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete
the task. On the right side of the display, click on the “Add/Remove
Dependencies for a Resource Definition” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete
the task.
Use the following CLI command to define a clusterwide resource:
cmgr> define resource A [of resource_type B] [in cluster C]
Entering this command specifies the name and resource type of the resource
you are defining within a specified cluster. If you have specified a default
cluster or a default resource type, you do not need to specify a resource
type or a cluster in this command and the CLI will use the default. When you use this command to define a resource, you define a clusterwide
resource that is not specific to a node. For information on defining a node-specific
resource, see Section 5.5.3. The following prompt appears:
resource A?
When this prompt appears during resource creation, you can enter the
following commands to specify the attributes of the resource you are defining
and to add and remove dependencies from the resource:
resource A? set key to value
resource A? add dependency E of type F
resource A? remove dependency E of type F
The attributes you define with the set key to value
command will depend on the type of resource you are defining, as described
in Section 5.5.1. For detailed information on how to determine the format for defining
resource attributes, see Section 5.5.2.3. When you are finished defining the resource and its dependencies, enter done to return to the cmgr prompt. To see the format in which you can specify the user-specific attributes
that you need to set for a particular resource type, you can enter the following
command to see the full definition of that resource type:
cmgr> show resource_type A in cluster B
For example, to see the key attributes you
define for a resource of a defined resource type IP_address,
you would enter the following command:
cmgr> show resource_type IP_address in cluster nfs-cluster
Name: IP_address
Predefined: true
Order: 401
Restart mode: 1
Restart count: 2
Action name: stop
Executable: /usr/lib/failsafe/resource_types/IP_address/stop
Maximum execution time: 80000ms
Monitoring interval: 0ms
Start monitoring time: 0ms
Action name: exclusive
Executable: /usr/lib/failsafe/resource_types/IP_address/exclusive
Maximum execution time: 100000ms
Monitoring interval: 0ms
Start monitoring time: 0ms
Action name: start
Executable: /usr/lib/failsafe/resource_types/IP_address/start
Maximum execution time: 80000ms
Monitoring interval: 0ms
Start monitoring time: 0ms
Action name: restart
Executable: /usr/lib/failsafe/resource_types/IP_address/restart
Maximum execution time: 80000ms
Monitoring interval: 0ms
Start monitoring time: 0ms
Action name: monitor
Executable: /usr/lib/failsafe/resource_types/IP_address/monitor
Maximum execution time: 40000ms
Monitoring interval: 20000ms
Start monitoring time: 50000ms
Type specific attribute: NetworkMask
Data type: string
Type specific attribute: interfaces
Data type: string
Type specific attribute: BroadcastAddress
Data type: string
No resource type dependencies
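The Type specific attribute entries at the end of the display show the keys
that you can set for a resource of that type, together with the data type of each
value. For the IP_address type shown here, the keys are NetworkMask, interfaces,
and BroadcastAddress. The display for other resource types is read the same way.
For a volume resource, for example, the display reflects the format in which you
can specify the group id, the device owner, and the device file permissions for
the volume. In this case, the devname-group key specifies the group id of
the device file, the devname_owner key specifies the
owner of the device file, and the devname_mode key specifies
the device file permissions. For example, to set the group id to sys, enter
the following command:
resource A? set devname-group to sys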
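If you want to pull the attribute keys out of the show resource_type display with a script, the display can be treated as a series of label: value lines. The following Python fragment is a sketch that assumes the display has exactly the layout shown in the example above; the function name type_specific_attributes is hypothetical.

```python
def type_specific_attributes(display_text):
    """Map each type-specific attribute name to its data type.

    Assumes each attribute appears as a "Type specific attribute: NAME"
    line followed by a "Data type: TYPE" line, as in the example display.
    """
    attributes = {}
    pending = None
    for line in display_text.splitlines():
        # Split on the first colon only, so values that contain
        # colons or slashes are left intact.
        label, _, value = line.partition(":")
        label, value = label.strip(), value.strip()
        if label == "Type specific attribute":
            pending = value
        elif label == "Data type" and pending is not None:
            attributes[pending] = value
            pending = None
    return attributes
```

Applied to the IP_address display, the result contains the keys NetworkMask, interfaces, and BroadcastAddress, each with the data type string.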
The remainder of this section summarizes the attributes you specify
for the predefined Linux FailSafe resource types with the set
key to value command of the Cluster Manager CLI. When you define an
IP address, you specify the following attributes:
- NetworkMask
The subnet mask of the IP address
- interfaces
A comma-separated list of interfaces on which the IP address can be
configured
- BroadcastAddress
The broadcast address for the IP address
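The define resource, set key to value, and done commands described in this section can be assembled mechanically from a table of attribute values. The following Python fragment is a sketch that only builds the text of those commands; the function name define_resource_commands is hypothetical, the attribute values in the usage note are made-up examples, and how the lines are passed to cmgr is deliberately left out.

```python
def define_resource_commands(name, resource_type, cluster, attributes):
    """Build the cmgr input lines that define one clusterwide resource.

    `attributes` maps type-specific attribute keys to their values.
    Only command forms shown in this section are generated.
    """
    lines = [
        f"define resource {name} of resource_type {resource_type} "
        f"in cluster {cluster}"
    ]
    # One "set key to value" line per type-specific attribute.
    for key, value in attributes.items():
        lines.append(f"set {key} to {value}")
    # "done" ends the definition and returns to the cmgr prompt.
    lines.append("done")
    return lines
```

For example, calling the function with the name 192.26.50.1, the type IP_address, the cluster nfs-cluster, and the attributes NetworkMask, interfaces, and BroadcastAddress yields one define resource line, three set lines, and a final done line.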
You can redefine an existing resource with a
resource definition that applies only to a particular node. Only existing
clusterwide resources can be redefined; resources already defined for a specific
cluster node cannot be redefined. You
use this feature when you configure heterogeneous clusters for an IP_address resource. For example, IP_address
192.26.50.2 can be configured on et0 on an SGI Challenge node and on eth0
on all other Linux servers. The clusterwide resource definition for 192.26.50.2 will have
the interfaces field set to eth0 and the node-specific
definition for the Challenge node will have et0 as the interfaces field. Using the Cluster Manager GUI, you can take an existing clusterwide
resource definition and redefine it for use on a specific node in the cluster: Launch the FailSafe Manager. On the left side of the display, click on the “Resources
& Resource Types” category. On the right side of the display click on the “Redefine
a Resource For a Specific Node” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete
the task.
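The relationship between a clusterwide resource definition and a node-specific redefinition can be pictured as a lookup in which the node-specific entry, when one exists, takes the place of the clusterwide entry. The following Python fragment is a conceptual sketch only; the function name effective_definition and the node names challenge1 and linux1 are hypothetical, and Linux FailSafe does not necessarily store definitions this way.

```python
def effective_definition(clusterwide, node_specific, resource, node):
    """Return the definition of `resource` that applies on `node`.

    `clusterwide` maps resource names to attribute dictionaries.
    `node_specific` maps (resource name, node name) pairs to the
    attribute dictionaries of node-specific redefinitions.
    """
    # A node-specific redefinition replaces the clusterwide one on
    # that node; every other node keeps the clusterwide definition.
    return node_specific.get((resource, node), clusterwide[resource])


# Example from this section: 192.26.50.2 uses eth0 clusterwide and
# et0 on the Challenge node (node names here are made up).
clusterwide = {"192.26.50.2": {"interfaces": "eth0"}}
node_specific = {("192.26.50.2", "challenge1"): {"interfaces": "et0"}}
```

With these tables, the lookup for node challenge1 returns the et0 definition, and the lookup for node linux1 returns the clusterwide eth0 definition.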
You can use the Cluster Manager CLI to redefine a clusterwide resource
to be specific to a node just as you define a clusterwide resource, except
that you specify a node on the define resource command. Use the following CLI command to define a node-specific resource:
cmgr> define resource A of resource_type B on node C [in cluster D]
If you have specified a default cluster, you do not need to specify
a cluster in this command and the CLI will use the default. After you have defined resources,
you can modify and delete them. You can modify only the type-specific attributes for a resource. You
cannot rename a resource once it has been defined. Note: There are some resource attributes whose modification does not take
effect until the resource group containing that resource is brought online
again. For example, if you modify the export options of a resource of type
NFS, the modifications do not take effect immediately; they take effect when
the resource is brought online.
To modify a resource with the Cluster Manager GUI, perform the following
procedure: Launch the FailSafe Manager. On the left side of the display, click on the “Resources
& Resource Types” category. On the right side of the display click on the “Modify
a Resource Definition” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete
the task, or click on “Cancel” to cancel.
To delete a resource with the Cluster Manager GUI, perform the following
procedure: Launch the FailSafe Manager. On the left side of the display, click on the “Resources
& Resource Types” category. On the right side of the display click on the “Delete
a Resource” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete
the task, or click on “Cancel” to cancel.
Use the following CLI command to modify a resource:
cmgr> modify resource A of resource_type B [in cluster C]
Entering this command specifies the name and resource type of the resource
you are modifying within a specified cluster. If you have specified a default
cluster, you do not need to specify a cluster in this command and the CLI
will use the default. You modify a resource using the same commands you use to define a resource. You can use the following command to delete a resource definition:
cmgr> delete resource A of resource_type B [in cluster D]
You can display resources in various ways. You can
display the attributes of a particular defined resource, you can display all
of the defined resources in a specified resource group, or you can display
all the defined resources of a specified resource type. The Cluster Manager GUI provides a convenient display of resources through
the FailSafe Cluster View. You can launch the FailSafe Cluster View directly,
or you can bring it up at any time by clicking on the “FailSafe Cluster
View” button at the bottom of the “FailSafe Manager” display. From the View menu of the FailSafe Cluster View, select Resources to
see all defined resources. The status of these resources will be shown in
the icon (green indicates online, grey indicates offline). Alternately, you
can select “Resources of Type” from the View menu to see resources
organized by resource type, or you can select “Resources by Group”
to see resources organized by resource group. Use the following command to view the parameters of a defined resource:
cmgr> show resource A of resource_type B
Use the following command to view all of the defined resources in a
resource group:
cmgr> show resources in resource_group A [in cluster B]
If you have specified a default cluster, you do not need to specify
a cluster in this command and the CLI will use the default. Use the following command to view all of the defined resources of a
particular resource type in a specified cluster:
cmgr> show resources of resource_type A [in cluster B]
If you have specified a default cluster, you do not need to specify
a cluster in this command and the CLI will use the default. The Linux FailSafe software includes many
predefined resource types. If these types fit the application you want to
make into a highly available service, you can reuse them. If none fits, you
can define additional resource types. Complete information on defining resource types is provided in the Linux FailSafe Programmer's Guide. This manual provides a summary
of that information. To define a new resource type, you must have the following information:
- Name of the resource type, with a maximum length of 255 characters.
- Name of the cluster to which the resource type will apply.
- Node on which the resource type will apply, if the resource type is to be restricted to a specific node.
- Order of performing the action scripts for resources of this type in relation to resources of other types. Resources are started in the increasing order of this value, and resources are stopped in the decreasing order of this value. See the Linux FailSafe Programmer's Guide for a full description of the order ranges available.
- Restart mode, which can be one of the following values:
- Number of local restarts (when restart mode is 1).
- Location of the executable script. This is always /usr/lib/failsafe/resource_types/rtname, where rtname is the resource type name.
- Monitoring interval, which is the time period (in milliseconds) between successive executions of the monitor action script; this is only valid for the monitor action script.
- Starting time for monitoring. When the resource group is brought online on a cluster node, Linux FailSafe will start monitoring the resources after the specified time period (in milliseconds).
- Action scripts to be defined for this resource type. You must specify scripts for start, stop, exclusive, and monitor, although the monitor script may contain only a return-success function if you wish. If you specify 1 for the restart mode, you must specify a restart script.
- Type-specific attributes to be defined for this resource type. The action scripts use this information to start, stop, and monitor a resource of this resource type. For example, NFS requires the following resource keys:
  - export-point, which takes a value that defines the export disk name. This name is used as input to the exportfs command. For example: export-point = /this_disk
  - export-info, which takes a value that defines the export options for the filesystem. These options are used in the exportfs command. For example: export-info = rw,sync,no_root_squash
  - filesystem, which takes a value that defines the raw filesystem. This name is used as input to the mount command. For example:
To define a new resource type, you use the Cluster Manager GUI or the
Cluster Manager CLI. To define a resource type with the Cluster Manager GUI, perform the
following steps: Launch the FailSafe Manager. On the left side of the display, click on the “Resources
& Resource Types” category. On the right side of the display click on the “Define
a Resource Type” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete
the task.
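The order value that you supply for a resource type determines the sequence in which resources are started and stopped: increasing order for start, decreasing order for stop. The following Python fragment is a sketch of that rule only; the function names start_sequence and stop_sequence are hypothetical. The order values in the usage note, 401 for IP_address and 300 for test_rt, are the ones that appear in the examples in this chapter.

```python
def start_sequence(orders):
    """Return resource type names in the order their resources start.

    `orders` maps each resource type name to its order value.
    Resources are started in increasing order of that value.
    """
    return sorted(orders, key=lambda name: orders[name])


def stop_sequence(orders):
    """Return resource type names in the order their resources stop.

    Resources are stopped in decreasing order of the order value,
    which is the start sequence reversed.
    """
    return list(reversed(start_sequence(orders)))
```

Given {"IP_address": 401, "test_rt": 300}, the start sequence is test_rt followed by IP_address, and the stop sequence is the reverse.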
The following steps show the use of cluster_mgr interactively
to define a resource type called test_rt. Log in as root. Execute the cluster_mgr command using the -p option to prompt you for information (the command name can be
abbreviated to cmgr):
# /usr/lib/failsafe/bin/cluster_mgr -p
Welcome to Linux FailSafe Cluster Manager Command-Line Interface
cmgr>
Use the set subcommand to specify the default
cluster used for cluster_mgr operations. In this example,
we use a cluster named test: Note: If you prefer, you can specify the cluster name as needed with each
subcommand.
Use the define resource_type subcommand.
By default, the resource type will apply across the cluster; if you wish to
limit the resource_type to a specific node, enter the node name when prompted.
If you wish to enable restart mode, enter 1 when prompted. Note: The following example only shows the prompts and answers for two
action scripts (start and stop) for
a new resource type named test_rt.
cmgr> define resource_type test_rt
(Enter "cancel" at any time to abort)
Node[optional]?
Order ? 300
Restart Mode ? (0)
DEFINE RESOURCE TYPE OPTIONS
0) Modify Action Script.
1) Add Action Script.
2) Remove Action Script.
3) Add Type Specific Attribute.
4) Remove Type Specific Attribute.
5) Add Dependency.
6) Remove Dependency.
7) Show Current Information.
8) Cancel. (Aborts command)
9) Done. (Exits and runs command)
Enter option:1
No current resource type actions
Action name ? start
Executable Time? 40000
Monitoring Interval? 0
Start Monitoring Time? 0
0) Modify Action Script.
1) Add Action Script.
2) Remove Action Script.
3) Add Type Specific Attribute.
4) Remove Type Specific Attribute.
5) Add Dependency.
6) Remove Dependency.
7) Show Current Information.
8) Cancel. (Aborts command)
9) Done. (Exits and runs command)
Enter option:1
Current resource type actions:
Action - 1: start
Action name ? stop
Executable Time? 40000
Monitoring Interval? 0
Start Monitoring Time? 0
0) Modify Action Script.
1) Add Action Script.
2) Remove Action Script.
3) Add Type Specific Attribute.
4) Remove Type Specific Attribute.
5) Add Dependency.
6) Remove Dependency.
7) Show Current Information.
8) Cancel. (Aborts command)
9) Done. (Exits and runs command)
Enter option:3
No current type specific attributes
Type Specific Attribute ? integer-att
Datatype ? integer
Default value[optional] ? 33
0) Modify Action Script.
1) Add Action Script.
2) Remove Action Script.
3) Add Type Specific Attribute.
4) Remove Type Specific Attribute.
5) Add Dependency.
6) Remove Dependency.
7) Show Current Information.
8) Cancel. (Aborts command)
9) Done. (Exits and runs command)
Enter option:3
Current type specific attributes:
Type Specific Attribute - 1: integer-att
Type Specific Attribute ? string-att
Datatype ? string
Default value[optional] ? rw
0) Modify Action Script.
1) Add Action Script.
2) Remove Action Script.
3) Add Type Specific Attribute.
4) Remove Type Specific Attribute.
5) Add Dependency.
6) Remove Dependency.
7) Show Current Information.
8) Cancel. (Aborts command)
9) Done. (Exits and runs command)
Enter option:5
No current resource type dependencies
Dependency name ? filesystem
0) Modify Action Script.
1) Add Action Script.
2) Remove Action Script.
3) Add Type Specific Attribute.
4) Remove Type Specific Attribute.
5) Add Dependency.
6) Remove Dependency.
7) Show Current Information.
8) Cancel. (Aborts command)
9) Done. (Exits and runs command)
Enter option:7
Current resource type actions:
Action - 1: start
Action - 2: stop
Current type specific attributes:
Type Specific Attribute - 1: integer-att
Type Specific Attribute - 2: string-att
No current resource type dependencies
Resource dependencies to be added:
Resource dependency - 1: filesystem
0) Modify Action Script.
1) Add Action Script.
2) Remove Action Script.
3) Add Type Specific Attribute.
4) Remove Type Specific Attribute.
5) Add Dependency.
6) Remove Dependency.
7) Show Current Information.
8) Cancel. (Aborts command)
9) Done. (Exits and runs command)
Enter option:9
Successfully created resource_type test_rt
cmgr> show resource_types
NFS
template
Netscape_web
test_rt
statd
Oracle_DB
MAC_address
IP_address
INFORMIX_DB
filesystem
volume
cmgr> exit
#
You can redefine an existing
resource type with a resource definition that applies only to a particular
node. Only existing clusterwide resource types can be redefined; resource
types already defined for a specific cluster node cannot be redefined. A resource type that is defined for a node overrides a
cluster-wide resource type definition with the same name; this allows an individual
node to override global settings from a clusterwide resource type definition.
You can use this feature if you want to have different script timeouts for
a node or you want to restart a resource on only one node in the cluster. For example, the IP_address resource has local restart
enabled by default. If you would like to have an IP address type without local
restart for a particular node, you can make a copy of the IP_address clusterwide resource type with all of the parameters the same except
for restart mode, which you set to 0. Using the Cluster Manager GUI, you can take an existing clusterwide
resource type definition and redefine it for use on a specific node in the
cluster. Perform the following tasks: Launch the FailSafe Manager. On the left side of the display, click on the “Resources
& Resource Types” category. On the right side of the display click on the “Redefine
a Resource Type For a Specific Node” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete
the task.
With the Cluster Manager CLI, you redefine a node-specific resource
type just as you define a cluster-wide resource type, except that you specify
a node on the define resource_type command. Use the following CLI command to define a node-specific resource type:
cmgr> define resource_type A on node B [in cluster C]
If you have specified a default cluster, you do not need to specify
a cluster in this command and the CLI will use the default. Like resources, a resource type can be
dependent on one or more other resource types. If such a dependency exists,
at least one instance of each of the dependent resource types must be defined.
For example, a resource type named Netscape_web might have
resource type dependencies on resource types named IP_address
and volume. If a resource named ws1
is defined with the Netscape_web resource type, then the
resource group containing ws1 must also contain at least
one resource of the type IP_address and one resource of
the type volume. When using the Cluster Manager GUI, you add or remove dependencies for
a resource type by selecting the “Add/Remove Dependencies for a Resource
Type” from the “Resources & Resource Types” display
and providing the indicated input. When using the Cluster Manager CLI, you
add or remove dependencies when you define or modify the resource type. After you have defined
resource types, you can modify and delete them. To modify a resource type with the Cluster Manager GUI, perform the
following procedure: Launch the FailSafe Manager. On the left side of the display, click on the “Resources
& Resource Types” category. On the right side of the display click on the “Modify
a Resource Type Definition” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete
the task, or click on “Cancel” to cancel.
To delete a resource type with the Cluster Manager GUI, perform the
following procedure: Launch the FailSafe Manager. On the left side of the display, click on the “Resources
& Resource Types” category. On the right side of the display click on the “Delete
a Resource Type” task link to launch the task. Enter the selected inputs. Click on “OK” at the bottom of the screen to complete
the task, or click on “Cancel” to cancel.
Use the following CLI command to modify a resource type:
cmgr> modify resource_type A [in cluster B]
Entering this command specifies the resource type you are modifying
within a specified cluster. If you have specified a default cluster, you do
not need to specify a cluster in this command and the CLI will use the default. You modify a resource type using the same commands you use to define
a resource type. You can use the following command to delete a resource type:
cmgr> delete resource_type A [in cluster B]
When you define a cluster, Linux
FailSafe installs a set of resource type definitions that you can use that
include default values. If you need to install additional standard Silicon
Graphics-supplied resource type definitions on the cluster, or if you delete
a standard resource type definition and wish to reinstall it, you can load
that resource type definition on the cluster. The resource type definition you are installing cannot exist on the
cluster. To install a resource type using the GUI, select the “Load a Resource”
task from the “Resources & Resource Types” task page and enter
the resource type to load. Use the following CLI command to install a resource type on a cluster:
cmgr> install resource_type A [in cluster B]
If you have specified a default cluster, you do not need to specify
a cluster in this command and the CLI will use the default. After you have defined resource types,
you can display them. The Cluster Manager GUI provides a convenient display of resource types
through the FailSafe Cluster View. You can launch the FailSafe Cluster View
directly, or you can bring it up at any time by clicking on the “FailSafe
Cluster View” button at the bottom of the “FailSafe Manager”
display. From the View menu of the FailSafe Cluster View, select Types to see
all defined resource types. You can then click on any of the resource type
icons to view the parameters of the resource type. Use the following command to view the parameters of a defined resource
type in a specified cluster:
cmgr> show resource_type A [in cluster B]
If you have specified a default cluster, you do not need to specify
a cluster in this command and the CLI will use the default. Use the following command to view all of the defined resource types
in a cluster:
cmgr> show resource_types [in cluster A]
If you have specified a default cluster, you do not need to specify
a cluster in this command and the CLI will use the default. Use the following command to view all of the defined resource types
that have been installed:
cmgr> show resource_types installed
Before you can configure your resources into a resource group, you must
determine which failover policy to apply to the resource group. To define
a failover policy, you provide the following information:
- The name of the failover policy, with a maximum length of 63 characters, which must be unique within the pool.
- The name of an existing failover script.
- The initial failover domain, which is an ordered list of the nodes on which the resource group may execute. The administrator supplies the initial failover domain when configuring the failover policy; this is input to the failover script, which generates the runtime failover domain.
- The failover attributes, which modify the behavior of the failover script.
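The items in a failover policy definition can be sanity-checked before you enter them. The following Python fragment is an illustrative sketch, not part of Linux FailSafe; the function name check_failover_policy and its arguments are hypothetical. It applies the 63-character and uniqueness rules for the policy name and the rules for combining the failover attributes Auto_Failback, Controlled_Failback, Auto_Recovery, and InPlace_Recovery that are stated in this section's discussion of failover attributes. Requiring a non-empty initial failover domain is an assumption of the sketch.

```python
FAILBACK = {"Auto_Failback", "Controlled_Failback"}
RECOVERY = {"Auto_Recovery", "InPlace_Recovery"}


def check_failover_policy(name, script, initial_domain, attributes,
                          existing_names=()):
    """Return a list of problems with a proposed failover policy."""
    problems = []

    # The name is limited to 63 characters and must be unique in the pool.
    if not name or len(name) > 63:
        problems.append("policy name must be 1 to 63 characters long")
    if name in existing_names:
        problems.append("policy name is already in use")

    if not script:
        problems.append("name an existing failover script")

    # Assumption of this sketch: the ordered node list must not be empty.
    if not initial_domain:
        problems.append("initial failover domain must list at least one node")

    attrs = set(attributes)
    # Exactly one of the two failback attributes is required.
    if len(attrs & FAILBACK) != 1:
        problems.append(
            "specify exactly one of Auto_Failback or Controlled_Failback")
    # The recovery attributes are optional but mutually exclusive.
    if len(attrs & RECOVERY) > 1:
        problems.append(
            "Auto_Recovery and InPlace_Recovery are mutually exclusive")

    return problems
```

An empty list means that the definition is consistent with these rules; whether the named failover script actually exists in the policies directory must still be checked on the nodes themselves.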
Complete information on failover policies and failover scripts, with
an emphasis on writing your own failover policies and scripts, is provided
in the Linux FailSafe Programmer's Guide. A failover script
helps determine the node that is chosen for a failed resource group. The failover
script takes the initial failover domain and transforms it into the runtime
failover domain. Depending upon the contents of the script, the initial and
the runtime domains may be identical. The ordered failover script is provided with the
Linux FailSafe release. The ordered script never changes
the initial domain; when using this script, the initial and runtime domains
are equivalent. The round-robin failover script is also provided
with the Linux FailSafe release. The round-robin script
selects the resource group owner in a round-robin (circular) fashion. This
policy can be used for resource groups that can be run in any node in the
cluster. Failover scripts are stored in the /usr/lib/failsafe/policies directory. If the ordered script does not meet
your needs, you can define a new failover script and place it in the /usr/lib/failsafe/policies directory. When you are using the FailSafe
GUI, the GUI automatically detects your script and presents it to you as a
choice for you to use. You can configure the Linux FailSafe database to use
your new failover script for the required resource groups. For information
on defining failover scripts, see the Linux FailSafe Programmer's
Guide. A failover domain is the ordered list of nodes
on which a given resource group can be allocated. The nodes listed in the
failover domain must be within the same cluster; however, the failover domain
does not have to include every node in the cluster. The failover domain can
be used to statically load balance the resource groups in a cluster. Examples: In a four-node cluster, two nodes might share a volume. The
failover domain of the resource group containing the volume will be the two
nodes that share the volume. If you have a cluster of nodes named venus, mercury, and pluto,
you could configure the following initial failover domains for resource groups
RG1 and RG2:
When you define a failover policy, you specify the initial
failover domain. The initial failover domain is used when a cluster
is first booted. The ordered list specified by the initial failover domain
is transformed into a runtime failover domain by
the failover script. With each failure, the failover script takes the current
runtime failover domain and potentially modifies it; the initial failover
domain is never used again. Depending on the runtime conditions and the contents
of the failover script, the initial and runtime failover domains may be identical. Linux FailSafe stores the runtime failover domain and uses it as input
to the next failover script invocation.

A failover attribute is a value that
is passed to the failover script and used by Linux FailSafe to
modify the runtime failover domain used for a specific resource group.
You can specify a failover attribute of Auto_Failback, Controlled_Failback, Auto_Recovery, or InPlace_Recovery.
Auto_Failback and Controlled_Failback are
mutually exclusive, but you must specify one or the other. Auto_Recovery and InPlace_Recovery are mutually exclusive,
and specifying either of them is optional.

A failover attribute of Auto_Failback specifies
that the resource group will be run on the first available node in the runtime
failover domain. If the first node fails, the next available node will be
used; when the first node reboots, the resource group will return to it. This
attribute is best used when some type of load balancing is required.

A failover attribute of Controlled_Failback specifies
that the resource group will be run on the first available node in the runtime
failover domain, and will remain running on that node until it fails. If the
first node fails, the next available node will be used; the resource group
will remain on this new node even after the first node reboots. This attribute
is best used when client/server applications have expensive recovery mechanisms,
such as databases or any application that uses TCP to communicate.

The recovery attributes Auto_Recovery and InPlace_Recovery determine the node on which a resource group will
be allocated when its state changes to online and a member of the group is
already allocated (such as when volumes are present). Auto_Recovery specifies that the failover policy will be used to allocate the
resource group; this is the default recovery attribute if you have specified
the Auto_Failback attribute. InPlace_Recovery specifies that the resource group will be allocated on the node
that already contains part of the resource group; this is the default recovery
attribute if you have specified the Controlled_Failback
attribute.

See the Linux FailSafe Programmer's Guide for
a full discussion of example failover policies.

To define a failover policy using the GUI, perform the following steps:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Failover Policies & Resource Groups” category.
3. On the right side of the display, click on the “Define a Failover Policy” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task.
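When you choose the attribute inputs for a policy, it can help to see the attribute rules and the failback behavior described above as code. The following Python sketch is illustrative only and is not part of Linux FailSafe: the function names resolve_attributes and choose_owner are hypothetical, and the owner selection assumes a script that, like the ordered script, leaves the domain order unchanged.

```python
from typing import Optional, Sequence, Set, Tuple

FAILBACK = {"Auto_Failback", "Controlled_Failback"}
RECOVERY = {"Auto_Recovery", "InPlace_Recovery"}


def resolve_attributes(attributes: Sequence[str]) -> Tuple[str, str]:
    """Validate a list of failover attributes and return (failback, recovery).

    Hypothetical helper: it only encodes the rules stated in the text.
    """
    unknown = set(attributes) - FAILBACK - RECOVERY
    if unknown:
        raise ValueError(f"unknown failover attribute(s): {sorted(unknown)}")
    failback = [a for a in attributes if a in FAILBACK]
    recovery = [a for a in attributes if a in RECOVERY]
    # Exactly one failback attribute is required.
    if len(failback) != 1:
        raise ValueError("specify exactly one of Auto_Failback or Controlled_Failback")
    # A recovery attribute is optional, but the two are mutually exclusive.
    if len(recovery) > 1:
        raise ValueError("Auto_Recovery and InPlace_Recovery are mutually exclusive")
    if recovery:
        return failback[0], recovery[0]
    # Default recovery attribute depends on the failback attribute.
    default = "Auto_Recovery" if failback[0] == "Auto_Failback" else "InPlace_Recovery"
    return failback[0], default


def choose_owner(domain: Sequence[str], up: Set[str],
                 current: Optional[str], failback: str) -> Optional[str]:
    """Pick the owner of a resource group (assumption: domain order is kept as is).

    domain  -- runtime failover domain, in order
    up      -- nodes that are currently available
    current -- present owner of the group, or None
    """
    available = [node for node in domain if node in up]
    if not available:
        return None
    if failback == "Controlled_Failback" and current in up:
        return current  # stay on the current node until it fails
    return available[0]  # first available node in the domain


domain = ["venus", "mercury", "pluto"]
# venus has failed and later rebooted; mercury currently owns the group.
print(choose_owner(domain, {"venus", "mercury", "pluto"}, "mercury", "Auto_Failback"))
print(choose_owner(domain, {"venus", "mercury", "pluto"}, "mercury", "Controlled_Failback"))
```

With Auto_Failback the group returns to venus once venus is available again; with Controlled_Failback it stays on mercury. The actual behavior of a cluster is determined by the failover script in use, so treat this only as a way to reason about the attribute semantics.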
To define a failover policy, enter the following command at the cmgr prompt to specify the name of the failover policy:

cmgr> define failover_policy A

The following prompt appears:

failover_policy A?

When this prompt appears, you can use the following commands to specify
the components of a failover policy:

failover_policy A? set attribute to B
failover_policy A? set script to C
failover_policy A? set domain to D
failover_policy A?
When you define a failover policy, you can set as many attributes and
domains as your setup requires by executing the add attribute
and add domain commands with different values. The CLI
also allows you to specify multiple domains in one command of the following
format:

failover_policy A? set domain to A B C ...
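If you define many policies, you may want to generate the command sequence rather than type it. The following Python sketch only builds text lines from the commands shown above; it does not run cmgr. The function name failover_policy_commands and the policy name fp_example are hypothetical, and the check for an empty domain is an assumption rather than a documented cmgr rule. The sketch repeats set attribute once per attribute; because the text above also refers to add attribute and add domain commands, confirm the exact verb against your cmgr version before using the output.

```python
from typing import List, Sequence


def failover_policy_commands(name: str, script: str,
                             attributes: Sequence[str],
                             domain: Sequence[str]) -> List[str]:
    """Return the cmgr input lines that define one failover policy.

    Hypothetical helper: it only formats the commands shown in this section.
    """
    if not domain:
        # Assumption: a policy without any node in its domain is not useful.
        raise ValueError("the failover domain must list at least one node")
    lines = [f"define failover_policy {name}"]
    # One line per attribute (the exact verb for extra values is unverified).
    lines += [f"set attribute to {attr}" for attr in attributes]
    lines.append(f"set script to {script}")
    # Multiple nodes in one command, in failover order.
    lines.append("set domain to " + " ".join(domain))
    lines.append("done")  # return to the cmgr prompt
    return lines


for line in failover_policy_commands(
        "fp_example", "ordered", ["Auto_Failback"], ["venus", "mercury"]):
    print(line)
```

Keeping the generator separate from any code that invokes cmgr lets you review the lines before you apply them to the cluster database.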
The components of a failover policy are described in detail in the Linux FailSafe Programmer's Guide and in summary in Section 5.5.12. When you are finished defining the failover policy, enter done to return to the cmgr prompt.

After you have defined a failover policy, you can modify or delete it.

To modify a failover policy with the Cluster Manager GUI, perform the
following procedure:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Failover Policies & Resource Groups” category.
3. On the right side of the display, click on the “Modify a Failover Policy Definition” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.

To delete a failover policy with the Cluster Manager GUI, perform the
following procedure:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Failover Policies & Resource Groups” category.
3. On the right side of the display, click on the “Delete a Failover Policy” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.
Use the following CLI command to modify a failover policy:

cmgr> modify failover_policy A

You modify a failover policy using the same commands you use to define
a failover policy.

You can use the following command to delete a failover policy definition:

cmgr> delete failover_policy A
You can use Linux FailSafe to display any of the following:

- The components of a specified failover policy
- All of the failover policies that have been defined
- All of the failover policy attributes that have been defined
- All of the failover policy scripts that have been defined

The Cluster Manager GUI provides a convenient display of failover policies
through the FailSafe Cluster View. You can launch the FailSafe Cluster View
directly, or you can bring it up at any time by clicking on the “FailSafe
Cluster View” prompt at the bottom of the “FailSafe Manager”
display.

From the View menu of the FailSafe Cluster View, select Failover Policies
to see all defined failover policies.

Use the following command to view the parameters of a defined failover
policy:

cmgr> show failover_policy A

Use the following command to view all of the defined failover policies:

cmgr> show failover_policies

Use the following command to view all of the defined failover policy
attributes:

cmgr> show failover_policy attributes

Use the following command to view all of the defined failover policy
scripts:

cmgr> show failover_policy scripts
Resources are configured together into resource groups. A resource group is a collection of interdependent
resources. If any individual resource in a resource group becomes unavailable
for its intended use, then the entire resource group is considered unavailable.
Therefore, a resource group is the unit of failover for Linux FailSafe. For example, a resource group could contain all of the resources that
are required for the operation of a web node, such as the web node itself,
the IP address with which it communicates to the outside world, and the disk
volumes containing the content that it serves.

When you define a resource group, you specify a failover
policy. A failover policy controls the behavior of a resource
group in failure situations.

To define a resource group, you provide the following information:

- The name of the resource group, with a maximum length of 63 characters
- The name of the cluster to which the resource group is available
- The resources to include in the resource group, and their resource types
- The name of the failover policy that determines which node will take over the services of the resource group on failure
Linux FailSafe does not allow resource groups that do not contain any
resources to be brought online. You can define up to 100 resources configured in any number of resource
groups.

To define a resource group with the Cluster Manager GUI, perform the
following steps:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on “Guided Configuration”.
3. On the right side of the display, click on “Set Up Highly Available Resource Groups” to launch the task link.
4. In the resulting window, click each task link in turn, as it becomes available. Enter the selected inputs for each task.
5. When finished, click “OK” to close the taskset window.
To configure a resource group, enter the following command at the cmgr prompt to specify the name of a resource group and the cluster
to which the resource group is available:

cmgr> define resource_group A [in cluster B]

Entering this command specifies the name of the resource group you are
defining within a specified cluster. If you have specified a default cluster,
you do not need to specify a cluster in this command and the CLI will use
the default.

The following prompt appears:

Enter commands, when finished enter either "done" or "cancel"
resource_group A?

When this prompt appears, you can use the following commands to specify
the resources to include in the resource group and the failover policy to
apply to the resource group:

resource_group A? add resource B of resource_type C
resource_group A? set failover_policy to D
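The resource group commands above can be assembled in the same way. The following Python sketch is a hedged illustration, not a FailSafe tool: resource_group_commands and can_go_online are hypothetical names, the group, policy, cluster, and address values are placeholders, and the code only formats the commands shown in this section while checking the 63-character name limit and the rule that an empty group cannot be brought online.

```python
from typing import List, Optional, Sequence, Tuple

MAX_GROUP_NAME = 63  # maximum resource group name length given in the text


def can_go_online(resources: Sequence[Tuple[str, str]]) -> bool:
    """A resource group with no resources cannot be brought online."""
    return len(resources) > 0


def resource_group_commands(name: str, failover_policy: str,
                            resources: Sequence[Tuple[str, str]],
                            cluster: Optional[str] = None) -> List[str]:
    """Return the cmgr input lines that define one resource group.

    resources is a sequence of (resource_name, resource_type) pairs.
    If cluster is None, the CLI default cluster is assumed to be set.
    """
    if not name or len(name) > MAX_GROUP_NAME:
        raise ValueError(f"group name must be 1 to {MAX_GROUP_NAME} characters")
    first = f"define resource_group {name}"
    if cluster is not None:
        first += f" in cluster {cluster}"
    lines = [first]
    for res_name, res_type in resources:
        lines.append(f"add resource {res_name} of resource_type {res_type}")
    lines.append(f"set failover_policy to {failover_policy}")
    lines.append("done")  # return to the cmgr prompt
    return lines


# Placeholder values: an IP_Address resource using a documentation address.
group = [("192.0.2.10", "IP_Address")]
for line in resource_group_commands("rg_example", "fp_example", group,
                                    cluster="test_cluster"):
    print(line)
print("can be brought online:", can_go_online(group))
```

The sketch allows an empty group to be defined, because the text restricts only bringing such a group online; adjust the checks if your site policy is stricter.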
After you have set the failover policy and you have finished adding
resources to the resource group, enter done to return
to the cmgr prompt.

For a full example of resource group creation using the Cluster Manager
CLI, see Section 5.7.

After you have defined
resource groups, you can modify and delete the resource groups. You can change
the failover policy of a resource group by specifying a new failover policy
associated with that resource group, and you can add resources to or delete resources from
the existing resource group. Note, however, that since you cannot have a resource
group online that does not contain any resources, Linux FailSafe does not
allow you to delete all resources from a resource group once the resource
group is online. Likewise, Linux FailSafe does not allow you to bring a resource
group online if it has no resources. Also, resources must be added and deleted
in atomic units; this means that resources which are interdependent must be
added and deleted together.

To modify the failover policy of a resource group with the Cluster Manager GUI, perform the
following procedure:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Failover Policies & Resource Groups” category.
3. On the right side of the display, click on the “Modify a Resource Group Definition” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.
To add resources to or delete resources from a resource group definition with the Cluster
Manager GUI, perform the following procedure:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Failover Policies & Resource Groups” category.
3. On the right side of the display, click on the “Add/Remove Resources in Resource Group” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.
To delete a resource group with the Cluster Manager GUI, perform the
following procedure:

1. Launch the FailSafe Manager.
2. On the left side of the display, click on the “Failover Policies & Resource Groups” category.
3. On the right side of the display, click on the “Delete a Resource Group” task link to launch the task.
4. Enter the selected inputs.
5. Click on “OK” at the bottom of the screen to complete the task, or click on “Cancel” to cancel.
Use the following CLI command to modify a resource group:

cmgr> modify resource_group A [in cluster B]

If you have specified a default cluster, you do not need to specify
a cluster in this command and the CLI will use the default. You modify a resource
group using the same commands you use to define a resource group:

resource_group A? add resource B of resource_type C
resource_group A? set failover_policy to D

You can use the following command to delete a resource group definition:

cmgr> delete resource_group A [in cluster B]
If you have specified a default cluster, you do not need to specify
a cluster in this command and the CLI will use the default.

You can display the parameters of a defined
resource group, and you can display all of the resource groups defined for
a cluster.

The Cluster Manager GUI provides a convenient display of resource groups
through the FailSafe Cluster View. You can launch the FailSafe Cluster View
directly, or you can bring it up at any time by clicking on the “FailSafe
Cluster View” prompt at the bottom of the “FailSafe Manager”
display.

From the View menu of the FailSafe Cluster View, select Groups to see
all defined resource groups.

To display which nodes are currently running which groups, select “Groups
owned by Nodes.” To display which groups are running which failover
policies, select “Groups by Failover Policies.”

Use the following command to view the parameters of a defined resource
group:

cmgr> show resource_group A [in cluster B]

If you have specified a default cluster, you do not need to specify
a cluster in this command and the CLI will use the default.

Use the following command to view all of the defined resource groups:

cmgr> show resource_groups [in cluster A]