Chapter 5. Linux Platforms

CXFS supports a client-only node running the Red Hat Enterprise Linux (RHEL) or SUSE Linux Enterprise Server (SLES) operating system.


Note: Nodes that you intend to run as metadata servers must be installed as server-capable administration nodes; all other nodes should be client-only nodes. For information about server-capable administration nodes, see the CXFS 5 Administration Guide for SGI InfiniteStorage.

This chapter contains the following sections:

For information about system tunable parameters, see CXFS 5 Administration Guide for SGI InfiniteStorage.

CXFS on Linux

This section contains the following information about CXFS on Linux systems:

Requirements for Linux

In addition to the items listed in “Requirements” in Chapter 1, using a Linux node to support CXFS requires the following:

  • One of the following:

    • RHEL

    • SLES

    See the release notes for the supported kernels, update levels, and service pack levels, plus information about SGI ProPack and SGI Foundation Software.

  • On Altix or Altix XE systems, serial lines and/or supported Fibre Channel switches. For supported switches, see the release notes. Either system reset or I/O fencing is required for all nodes.

  • A choice of at least one Fibre Channel host bus adapter (HBA), depending upon hardware type:

    • Altix or Altix XE hardware:

      • QLogic QLA2310, QLA2342, or QLA2344

      • LSI Logic LSI7104XP-LC, LSI7204XP-LC, or LSI7204EP-LC


        Note: The LSI HBA requires the 01030600 firmware.


    • Third-party hardware:

      • QLogic QLA2200, QLA2200F, QLA2310, QLA2342, QLA2344

      • LSI Logic LSI7202XP-LC, LSI7402XP-LC, LSI7104XP-LC, LSI7204XP-LC, LSI7404XP-LC


        Note: The LSI HBA requires the 01030600 firmware or newer.


  • A CPU of the following class:

    • x86_64 architecture, such as:

      • AMD Opteron

      • Intel Xeon EM64T

    • ia64 architecture, such as Intel Itanium 2

    The machine must have at least the following minimum requirements:

    • 256 MB of RAM

    • Two Ethernet 100baseT interfaces

    • One empty PCI slot (to receive the HBA)

For the latest information, see the CXFS Linux release notes.


Note: If you use I/O fencing and ipfilterd on a node, the ipfilterd configuration must allow communication between the node and the telnet port on the switch. Also see “Configure Firewalls for CXFS Use” in Chapter 2.


CXFS Commands on Linux

The following commands are shipped as part of the CXFS Linux package:

/usr/cluster/bin/cxfs_config
/usr/cluster/bin/cxfs_client
/usr/cluster/bin/cxfs_info
/usr/cluster/bin/cxfscp
/usr/cluster/bin/cxfsdump
/usr/sbin/grioadmin
/usr/sbin/griomon
/usr/sbin/grioqos
/sbin/xvm

The cxfs_client and xvm commands are needed to include a client-only node in a CXFS cluster. The cxfs_info command reports the current status of this node in the CXFS cluster.

The rpm command output lists all software added; see “Linux Installation Procedure”.

For more information, see the man pages.

Log Files on Linux

The cxfs_client command creates a /var/log/cxfs_client log file. You should monitor the /var/log/cxfs_client and /var/log/messages log files for problems. Look for a Membership delivered message to indicate that a cluster was formed.
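As a quick check, the log can be searched for that message. The following is a minimal sketch, not part of the CXFS software: the function name membership_delivered is hypothetical, and it assumes the message appears literally as "Membership delivered" in the log.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given cxfs_client log contains a
# "Membership delivered" message, which indicates that a cluster was formed.
membership_delivered() {
    log=${1:-/var/log/cxfs_client}
    grep -q 'Membership delivered' "$log"
}

# Report on the default log file only if it exists and is readable.
if [ -r /var/log/cxfs_client ]; then
    if membership_delivered; then
        echo "cluster membership was delivered"
    else
        echo "no Membership delivered message found yet"
    fi
fi
```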

The Linux platform uses the logrotate system utility to rotate the CXFS logs (as opposed to other multiOS platforms, which use the -z option to cxfs_client):

  • The /etc/logrotate.conf file specifies how often system logs are rotated

  • The /etc/logrotate.d/cxfs_client file specifies the manner in which cxfs_client logs are rotated

For information about the log files created on server-capable administration nodes, see the CXFS 5 Administration Guide for SGI InfiniteStorage.

CXFS Mount Scripts on Linux

Linux supports the CXFS mount scripts. See “CXFS Mount Scripts” in Chapter 1 and the CXFS 5 Administration Guide for SGI InfiniteStorage.

For RHEL nodes, in order for cxfs-reprobe to appropriately probe all of the targets on the SCSI bus, you must define a group of environment variables in the /etc/cluster/config/cxfs_client.options file. For more information, see “Using cxfs-reprobe on IRIX Nodes” in Chapter 4.

Limitations and Considerations for Linux

Note the following:

  • On Linux systems, the use of XVM is supported only with CXFS; XVM does not support local Linux disk volumes.

  • On systems running SUSE Linux Enterprise Server 10 (SLES 10) that have more than 64 CPUs, there are issues with using the md driver and CXFS. The md driver holds the BKL (Big Kernel Lock), which is a single, system-wide spin lock. Attempting to acquire this lock can add substantial latency to a driver's operation, which in turn holds off other processes such as CXFS. The delay causes CXFS to lose membership. This problem has been observed specifically when an md pair RAID split is done, such as the following:

    raidsetfaulty /dev/md1 /dev/path/to/partition

  • Although it is possible to mount other filesystems on top of a Linux CXFS filesystem, this is not recommended.

  • CXFS filesystems with XFS version 1 directory format cannot be mounted on Linux nodes.

  • The implementation of file creates using O_EXCL is not complete. Multiple applications running on the same node using O_EXCL creates as a synchronization mechanism will see the expected behavior (only one of the creates will succeed). However, applications running between nodes may not get the O_EXCL behavior they requested (creates of the same file from two or more separate nodes may all succeed).

  • The Fibre Channel HBA driver must be loaded before CXFS services are started. The HBA driver could be loaded early in the initialization scripts or be added to the initial RAM disk for the kernel. See the mkinitrd man page for more information.

  • RHEL 5 x86_64 nodes have a severely limited kernel stack size. To use CXFS on these nodes requires the following to avoid a stack overflow panic:

  • Case-insensitive CXFS filesystems are not supported on SLES 10 and RHEL client-only nodes. These nodes will fail to mount the filesystem with messages such as the following:

    Preparing to mount CXFS file system "/dev/cxvm/tp91"
    XFS: bad version
    XFS: SB validate failed


    Note: Nodes that use enhanced XFS (SLES 10 nodes that are installed with the CXFS Edge Server software and SLES 11 nodes) do support case-insensitive filesystems.


See also Appendix B, “Filesystem and Logical Unit Specifications”.

Using the dmi Mount Option on a SLES 10 or SLES 11 Node

By default, DMAPI is turned off on SLES 10 and SLES 11 systems. If you want to mount CXFS filesystems on a SLES 10 or SLES 11 client-only node with the dmi mount option, you must set DMAPI_PROBE="yes" in the /etc/sysconfig/sysctl file on the node. Changes to the file will be processed on the next reboot. After setting that system configuration file, you can immediately enable DMAPI by executing the following:

# sysctl -w fs.xfs.probe_dmapi=1

Access Control Lists and Linux

All CXFS files have UNIX mode bits (read, write, and execute) and optionally an access control list (ACL). For more information about POSIX ACLs, see the chmod and setfacl man pages.

HBA Installation for Linux

This section provides an overview of the Fibre Channel host bus adapter (HBA) installation information for Linux nodes.

The installation may be performed by you or by a qualified service representative for your hardware. See the Linux operating system documentation and the documentation for your hardware platform.

The driver requirements are as follows:

  • LSI Logic card: the drivers are supplied with the Linux kernel. The module names are mptscsih and mptfc. The LSI lsiutil command displays the number of LSI HBAs installed, the model numbers, and firmware versions.

  • QLogic card: the drivers are supplied with the Linux kernel.

You must ensure that the HBA driver is loaded prior to CXFS initialization by building the module into the initial RAM disk automatically or manually. For example, using the QLogic card and the qla2200 driver:

  • Automatic method: For RHEL, add a new line such as the following to the /etc/modprobe.conf file:

    alias scsi_hostadapter1 qla2200

    For SLES, add the driver name to the INITRD_MODULES variable in the /etc/sysconfig/kernel file. After adding the HBA driver into INITRD_MODULES, you must rebuild initrd with mkinitrd.


    Note: If the host adapter is installed in the box when the operating system is installed, this may not be necessary. The hardware may also be detected at boot time.

    When the new kernel is installed, the driver will be automatically included in the corresponding initrd image.

  • Manual method: recreate your initrd to include the appropriate HBA driver module. For more information, see the operating system documentation for the mkinitrd command.
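For the SLES case, the following minimal sketch checks whether a driver is already listed in INITRD_MODULES before you rebuild initrd. The function name initrd_has_module is hypothetical, and the sketch assumes that the file uses ordinary shell variable syntax so that it can be read with the shell's dot command.

```shell
#!/bin/sh
# Hypothetical helper: succeed if DRIVER is listed in the INITRD_MODULES
# variable of a SLES-style /etc/sysconfig/kernel file.
initrd_has_module() {
    driver=$1
    file=${2:-/etc/sysconfig/kernel}
    # Read the file in a subshell so the caller's environment is untouched.
    (
        INITRD_MODULES=
        . "$file" || exit 2
        for m in $INITRD_MODULES; do
            [ "$m" = "$driver" ] && exit 0
        done
        exit 1
    )
}

# Report only if the SLES configuration file is present on this node.
if [ -r /etc/sysconfig/kernel ] && ! initrd_has_module qla2200; then
    echo "qla2200 is not in INITRD_MODULES; add it and rerun mkinitrd"
fi
```

The comparison is made word by word, so a driver name that is only a prefix of another module name does not match.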

You should then verify the appropriate initrd information:

  • If using the GRUB loader, verify that the following line appears in the /boot/grub/grub.conf file:

    initrd /initrd-version.img

  • If using the LILO loader, do the following:

    1. Verify that the following line appears in the appropriate stanza of /etc/lilo.conf:

      /boot/initrd-version.img

    2. Rerun LILO.

The system must be rebooted (and when using LILO, LILO must be rerun) for the new initrd image to take effect.

Instead of this procedure, you could also modify the /etc/rc.sysinit script to load the qla2200 driver early in the initscript sequence.

Preinstallation Steps for Linux

This section provides an overview of the steps that you will perform on your Linux nodes prior to installing the CXFS software. It contains the following sections:

Adding a Private Network for Linux

The following procedure provides an overview of the steps required to add a private network to the Linux system. A private network is required for use with CXFS. See “Use a Private Network” in Chapter 2.

You may skip some steps, depending upon the starting conditions at your site. For details about any of these steps, see the Linux operating system documentation.

  1. Edit the /etc/hosts file so that it contains entries for every node in the cluster and their private interfaces as well.

    The /etc/hosts file has the following format, where primary_hostname can be the simple hostname or the fully qualified domain name:

    IP_address    primary_hostname    aliases

    You should be consistent when using fully qualified domain names in the /etc/hosts file. If you use fully qualified domain names on a particular node, then all of the nodes in the cluster should use the fully qualified name of that node when defining the IP/hostname information for that node in their /etc/hosts file.

    The decision to use fully qualified domain names is usually a matter of how the clients (such as NFS) are going to resolve names for their client server programs, how their default resolution is done, and so on.

    Even if you are using the domain name service (DNS) or the network information service (NIS), you must add every IP address and hostname for the nodes to /etc/hosts on all nodes. For example:

    190.0.2.1 server1.company.com server1
    190.0.2.3 stocks
    190.0.3.1 priv-server1
    190.0.2.2 server2.company.com server2
    190.0.2.4 bonds
    190.0.3.2 priv-server2

    You should then add all of these IP addresses to /etc/hosts on the other nodes in the cluster.

    For more information, see the hosts and resolver man pages.


    Note: Exclusive use of NIS or DNS for IP address lookup for the nodes will reduce availability in situations where the NIS or DNS service becomes unreliable.


    For more information, see “Understand Hostname Resolution and Network Configuration Rules” in Chapter 2.

  2. Edit the /etc/nsswitch.conf file so that local files are accessed before either NIS or DNS. That is, the hosts line in /etc/nsswitch.conf must list files first. For example:

    hosts:      files nis dns

    (The order of nis and dns is not significant to CXFS, but files must be first.)

  3. Configure your private interface according to the instructions in the Network Configuration section of your Linux distribution manual. To verify that the private interface is operational, issue the following command:

    linux# ifconfig -a
    
    eth0      Link encap:Ethernet  HWaddr 00:50:81:A4:75:6A
              inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:13782788 errors:0 dropped:0 overruns:0 frame:0
              TX packets:60846 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:100
              RX bytes:826016878 (787.7 Mb)  TX bytes:5745933 (5.4 Mb)
              Interrupt:19 Base address:0xb880 Memory:fe0fe000-fe0fe038
    
    eth1      Link encap:Ethernet  HWaddr 00:81:8A:10:5C:34
              inet addr:10.0.0.10  Bcast:10.0.0.255  Mask:255.255.255.0
              UP BROADCAST MULTICAST  MTU:1500  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:100
              RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
              Interrupt:19 Base address:0xef00 Memory:febfd000-febfd038
    
    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:162 errors:0 dropped:0 overruns:0 frame:0
              TX packets:162 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:11692 (11.4 Kb)  TX bytes:11692 (11.4 Kb)

    This example shows that two Ethernet interfaces, eth0 and eth1, are present and running (as indicated by UP in the third line of each interface description).

    If the second network does not appear, it may be that a network interface card must be installed in order to provide a second network, or it may be that the network is not yet initialized.
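The name-resolution requirements in steps 1 and 2 can be checked with a short script. The following is a minimal sketch under stated assumptions: the function names hosts_files_first and missing_hosts are hypothetical, and the hostnames you pass in are whatever names your site uses for its nodes and private interfaces.

```shell
#!/bin/sh
# Hypothetical helpers for checking the name-resolution files on a node.

# Succeed if the hosts line of an nsswitch.conf-style file lists "files" first.
hosts_files_first() {
    awk '$1 == "hosts:" { found = 1; ok = ($2 == "files") }
         END { exit !(found && ok) }' "${1:-/etc/nsswitch.conf}"
}

# Print each NAME that has no entry (primary hostname or alias) in HOSTS_FILE.
# Usage: missing_hosts HOSTS_FILE NAME...
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        awk -v n="$name" '
            { sub(/#.*/, "") }                                   # drop comments
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } # skip the IP field
            END { exit !found }' "$hosts_file" || echo "$name"
    done
}

# Example use on a node (names taken from the /etc/hosts example above):
#   hosts_files_first || echo "files must be listed first in /etc/nsswitch.conf"
#   missing_hosts /etc/hosts server1 priv-server1 server2 priv-server2
```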

Modifications Required for CXFS GUI Connectivity Diagnostics for Linux

In order to test node connectivity by using the GUI, the root user on the node running the CXFS diagnostics must be able to access a remote shell using the rsh command (as root) on all other nodes in the cluster. (This test is not required when using cxfs_admin because it verifies the connectivity of each node as it is added to the cluster.)

There are several ways of accomplishing this, depending on the existing settings in the pluggable authentication modules (PAMs) and other security configuration files.

The following method works with default settings. Do the following on all nodes in the cluster:

  1. Install the rsh-server RPM.

  2. Enable rsh.

  3. Restart xinetd.

  4. Add rsh to the /etc/securetty file.

  5. Add the hostname of the node from which you will be running the diagnostics into the /root/.rhosts file. Make sure that the mode of the .rhosts file is set to 600 (read and write access for the owner only).

After you have completed running the connectivity tests, you may wish to disable rsh on all cluster nodes.

For more information, see the Linux operating system documentation about PAM and the hosts.equiv man page.

Verifying the Private and Public Networks for Linux

For each private network on each Linux node in the pool, verify access with the ping command:

  1. Enable multicast ping using one or more of the following methods (the permanent method will not take effect until after a reboot):

    • Immediate but temporary method:

      linux# echo "0" > /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts

      For more information, see http://kerneltrap.org/node/16225

    • Immediate but temporary method:

      linux# sysctl -w net.ipv4.icmp_echo_ignore_broadcasts=0

    • Permanent method upon reboot (survives across reboots):

      1. Remove the following line (if it exists) from the /etc/sysctl.conf file:

        net.ipv4.icmp_echo_ignore_broadcasts = 1

      2. Add the following line to the /etc/sysctl.conf file:

        net.ipv4.icmp_echo_ignore_broadcasts = 0

  2. Execute a ping using the private network. Enter the following, where nodeIPaddress is the IP address of the node:

    ping nodeIPaddress

    For example:

    linux# ping 10.0.0.1
    PING 10.0.0.1 (10.0.0.1) from 128.162.240.141 : 56(84) bytes of data.
    64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.310 ms
    64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.122 ms
    64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.127 ms

  3. Execute a ping using the public network.

  4. If ping fails, repeat the following procedure on each node:

    1. Verify that the network interface was configured up using ifconfig. For example:

      linux# ifconfig eth1
      eth1      Link encap:Ethernet  HWaddr 00:81:8A:10:5C:34
                inet addr:10.0.0.10  Bcast:10.0.0.255  Mask:255.255.255.0
                UP BROADCAST MULTICAST  MTU:1500  Metric:1
                RX packets:0 errors:0 dropped:0 overruns:0 frame:0
                TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
                collisions:0 txqueuelen:100
                RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
                Interrupt:19 Base address:0xef00 Memory:febfd000-febfd038

      In the third output line above, UP indicates that the interface was configured up.

    2. Verify that the cables are correctly seated.

  5. Repeat this procedure on each node.
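To confirm the permanent multicast ping setting from step 1, you can inspect the configuration file directly. The following is a minimal sketch; the function name broadcast_ping_setting is hypothetical, and it assumes the simple "key = value" line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the value that a sysctl.conf-style file assigns to
# net.ipv4.icmp_echo_ignore_broadcasts (the last assignment in the file is
# reported), or "unset" if the file does not mention the key.
broadcast_ping_setting() {
    awk -F= '
        /^[ \t]*#/ { next }                      # skip comment lines
        {
            key = $1; val = $2
            gsub(/[ \t]/, "", key); gsub(/[ \t]/, "", val)
            if (key == "net.ipv4.icmp_echo_ignore_broadcasts") result = val
        }
        END { print (result == "" ? "unset" : result) }
    ' "${1:-/etc/sysctl.conf}"
}

# Example use on a node:
#   broadcast_ping_setting /etc/sysctl.conf
```

A result of 0 corresponds to the permanent method described in step 1; a result of 1 means that the line to be removed is still present.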

Client Software Installation for Linux

This section discusses the following:

Linux Installation Procedure

The CXFS software will be initially installed and configured by SGI personnel. This section provides an overview of those procedures. You can use the information in this section to verify the installation.

Table 5-1 and Table 5-2 provide examples of the differences in package extensions among the various processor classes supported by CXFS.


Note: The kernel package extensions vary by architecture. Ensure that you install the appropriate package for your processor architecture.


Table 5-1. RHEL Processor and Package Extension Examples

Class     Example Processors    User Package              Kernel Package
                                Architecture Extension    Architecture Extension

x86_64    AMD Opteron           .x86_64.rpm               .x86_64.rpm
          Intel Xeon EM64T      .x86_64.rpm               .x86_64.rpm
ia64      Intel Itanium 2       .ia64.rpm                 .ia64.rpm


Table 5-2. SLES Processor and Package Extension Examples

Class     Example Processors    User and Kernel Package
                                Architecture Extension

x86_64    AMD Opteron           .x86_64.rpm
          Intel Xeon EM64T      .x86_64.rpm
ia64      Intel Itanium 2       .ia64.rpm



Note: Specific packages listed here are examples and may not match the released product.

Installing the CXFS client software for Linux requires approximately 50-200 MB of space, depending upon the packages installed at your site.

To install the required software on a Linux node, do the following:

  1. Read the SGI InfiniteStorage Software Platform release notes, CXFS general release notes, and CXFS Linux release notes in the /docs directory on the ISSP DVD and any late-breaking caveats on Supportfolio.

  2. Verify that the node is running a supported Linux distribution and kernel, according to the CXFS for Linux release notes. See the Red Hat /etc/redhat-release or SLES /etc/SuSE-release files and enter the following:

    linux_cxfsclient# uname -r

  3. (Optional) Verify that the node is running the supported level of SGI Foundation Software and (optionally) SGI ProPack, according to the CXFS for Linux release notes. For more information, see the Start Here for the supported versions of SGI Foundation Software and SGI ProPack. Also install any required patches. See the releasenotes/README file for more information.

  4. If you had to install software in one of the above steps, reboot the system:

    linux_cxfsclient# /sbin/reboot

  5. Transfer the client-only software (that was downloaded onto a CXFS server-capable administration node during its installation procedure) from the server to the client using ftp, rcp, or scp.

    The location of the tarball on the server will be as follows:

    /usr/cluster/client-dist/CXFS_VERSION/linux/CLIENT_LINUX_VERSION/CLIENT_ARCHITECTURE/cxfs-client.tar.gz

    For example, for an Altix ia64 client, the location of the tarball on the server will be:

    /usr/cluster/client-dist/5.6.0.3/linux/sles10sp2/ia64/cxfs-client.tar.gz

    In this case, you could do the following:

    cxfs_server# cd /usr/cluster/client-dist/5.6.0.3/linux/sles10sp2/ia64
    cxfs_server# scp cxfs-client.tar.gz linux_cxfsclient:/tmp/cxfs/

  6. Disassemble the downloaded tarball on the Linux client-only node. For example:

    linux_cxfsclient# cd /tmp/cxfs
    linux_cxfsclient# tar -zxvf tarball

    After you extract the information using tar, the RPMs will be in the following directory:

    /tmp/cxfs/sgi-install/SGI/RPMS

  7. Install the CXFS software:

    • For RHEL:

      • Including GRIOv2:

        rhel_cxfsclient# rpm -Uvh *.rpm

      • Without GRIOv2:

        rhel_cxfsclient# rpm -Uvh cxfs*rpm sgi-*-kmp-*rpm sgi*rpm

    • For SLES, where Kernelvariant is either smp (SLES 10) or default (SLES 10 or SLES 11):

      • Including GRIOv2:

        sles_cxfsclient# rpm -Uvh cxfs*rpm grio2*rpm sgi-*-kmp-Kernelvariant-*rpm

      • Without GRIOv2:

        sles_cxfsclient# rpm -Uvh cxfs*rpm sgi-*-kmp-Kernelvariant-*rpm

  8. Edit the /etc/cluster/config/cxfs_client.options file as necessary. See “Maintenance for Linux” and the cxfs_client(1M) man page.

  9. Reboot the system:

    linux_cxfsclient# reboot
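The tarball location used in step 5 follows a fixed pattern, so it can be composed from its three variable parts. The following is a minimal sketch; the function name client_tarball_path is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the server-side path of the client tarball from
# the CXFS version, the client Linux version, and the client architecture.
client_tarball_path() {
    printf '/usr/cluster/client-dist/%s/linux/%s/%s/cxfs-client.tar.gz\n' "$1" "$2" "$3"
}

client_tarball_path 5.6.0.3 sles10sp2 ia64
# prints /usr/cluster/client-dist/5.6.0.3/linux/sles10sp2/ia64/cxfs-client.tar.gz
```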

Installing the Performance Co-Pilot Agent

The cxfs_utils package includes a Performance Co-Pilot (PCP) agent for monitoring CXFS heartbeat, CMS status and other statistics. If you want to use this feature, you must also install the following PCP packages:

  • pcp-open from the SGI Foundation Software release

  • pcp-sgi from the SGI ProPack release

These packages are included with SGI Foundation Software and SGI ProPack, respectively. You can obtain the open source PCP package from ftp://oss.sgi.com/projects/pcp/download

Verifying the Linux Installation

Use the uname -r command to ensure the kernel installed above is running.

To verify that the CXFS software has been installed properly, use the rpm -qa command to display all of the installed packages. You can filter the output by searching for a particular package name.
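One way to filter the package list is sketched below. The function name cxfs_packages is hypothetical, and the name prefixes are assumptions taken from the file patterns used in the installation step (cxfs*, grio2*, and sgi*); the packages at your site may be named differently.

```shell
#!/bin/sh
# Hypothetical filter: read package names on standard input and keep only
# those that begin with one of the prefixes used in the installation step.
cxfs_packages() {
    grep -E '^(cxfs|grio2|sgi)'
}

# Example use on a node:
#   rpm -qa | cxfs_packages
```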

I/O Fencing for Linux

I/O fencing is required on Linux nodes in order to protect data integrity of the filesystems in the cluster. The cxfs_client software automatically detects the world wide port names (WWPNs) of any supported host bus adapters (HBAs) for Linux nodes that are connected to a switch that is configured in the cluster database. These HBAs are available for fencing.

However, if no WWPNs are detected, the following message will be logged to the /var/log/cxfs_client file:

cis_get_hbas no local HBAs found - falling back to /etc/fencing.conf

If no WWPNs are detected, you can manually specify the WWPNs in the fencing file.


Note: This method does not work if the WWPNs are partially discovered.

The /etc/fencing.conf file enumerates the WWPNs for all of the HBAs that will be used to mount a CXFS filesystem. There must be a line for each HBA WWPN as a 64-bit hexadecimal number.


Note: The WWPN is that of the HBA itself, not any of the devices that are visible to that HBA in the fabric.

You must update the /etc/fencing.conf file whenever the HBA configuration changes, including the replacement of an HBA.

For dual-ported HBAs, the file must include the WWPNs of any ports that are used to access cluster disks. This may result in multiple WWPNs per HBA in the file; the numbers will probably differ by a single digit. For example, if you determined that port 0 is the port connected to the switch, your fencing file should contain the following (comment lines begin with #):

# WWPN of the HBA installed on this system
#
2000000173002c0b
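Because each WWPN is a 64-bit hexadecimal number, a fencing file can be checked for malformed entries before it is used. The following is a minimal sketch; the function name invalid_wwpn_lines is hypothetical, and it assumes that WWPNs are written as 16 hexadecimal digits with no 0x prefix or separators, as in the example above.

```shell
#!/bin/sh
# Hypothetical check: print every line of a fencing.conf-style file that is
# neither blank, nor a comment, nor a 16-digit (64-bit) hexadecimal WWPN.
invalid_wwpn_lines() {
    grep -v -E '^[[:space:]]*(#|$)' "${1:-/etc/fencing.conf}" |
        grep -v -E '^[[:space:]]*[0-9A-Fa-f]{16}[[:space:]]*$'
}

# Example use on a node:
#   invalid_wwpn_lines /etc/fencing.conf
```

No output means that every entry has the expected form; it does not confirm that the WWPNs belong to the HBAs installed in the node.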

To configure fencing, see the CXFS 5 Administration Guide for SGI InfiniteStorage.

Start/Stop cxfs_client for Linux

The cxfs_client service will be invoked automatically during normal system startup and shutdown procedures. This script starts and stops the cxfs_client daemon.

To start up cxfs_client manually, enter the following:

linux# service cxfs_client start
Loading cxfs modules:                                      [  OK  ]
Mounting devfs filesystems:                                [  OK  ]
Starting cxfs client:                                      [  OK  ]

To stop cxfs_client manually, enter the following:

linux# service cxfs_client stop
Stopping cxfs client:                                      [  OK  ]

To stop and then start cxfs_client manually, enter the following:

linux# service cxfs_client restart
Stopping cxfs client:                                      [  OK  ]

To see the current status, use the status argument. For example:

linux# service cxfs_client status
cxfs_client status [timestamp Apr 20 14:54:30 / generation 4364]

CXFS client:
    state: stable (5), cms: up, xvm: up, fs: up
Cluster:
    connies_cluster (707) - enabled
Local:
    ceara (7) - enabled
Nodes:
    aiden      enabled  up    12    
    brenna     enabled  DOWN  10    
    brigid     enabled  up    11    
    ceara      enabled  up    7     
    chili      enabled  up    4     
    cxfsibm2   enabled  up    9     
    cxfssun4   enabled  up    5     
    daghada    enabled  up    8     
    flynn      enabled  up    2     
    gaeth      enabled  up    0     
    minnesota  enabled  up    6     
    rowan      enabled  up    3     
    rylie      enabled  up    1     
Filesystems:
    concatfs   enabled  mounted          concatfs             /concatfs
    stripefs   enabled  mounted          stripefs             /stripefs
    tp9300_stripefs enabled  forced mounted   tp9300_stripefs      /tp9300_stripefs
cxfs_client is running.

For example, if cxfs_client is stopped:

linux# service cxfs_client status
cxfs_client is stopped
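The status output can be filtered to list only the nodes that are down. The following is a minimal sketch; the function name down_nodes is hypothetical, and it assumes the layout shown in the example above, in which the node entries are indented beneath an unindented Nodes: line.

```shell
#!/bin/sh
# Hypothetical filter: read "service cxfs_client status" output on standard
# input and print the name of each node whose state column is DOWN.
down_nodes() {
    awk '
        /^Nodes:/  { in_nodes = 1; next }   # start of the node table
        /^[^ \t]/  { in_nodes = 0 }         # any other unindented line ends it
        in_nodes && $3 == "DOWN" { print $1 }
    '
}

# Example use on a node:
#   service cxfs_client status | down_nodes
```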

Maintenance for Linux

This section contains information about maintenance procedures for CXFS on Linux:

Modifying the CXFS Software for Linux

You can modify the behavior of the CXFS client daemon (cxfs_client) by placing options in the /etc/cluster/config/cxfs_client.options file. The available options are documented in the cxfs_client man page.


Caution: Some of the options are intended to be used internally by SGI only for testing purposes and do not represent supported configurations. Consult your SGI service representative before making any changes.

To see if cxfs_client is using the options in cxfs_client.options, enter the following:

linux# ps -ax | grep cxfs_client
 3612 ?        S      0:00 /usr/cluster/bin/cxfs_client -i cxfs3-5
 3841 pts/0    S      0:00 grep cxfs_client

To be sure that cxfs_client is configured to start up on boot, view the chkconfig output, which should appear similar to the following:

linux# chkconfig --list | grep cxfs_client
cxfs_client               0:off  1:off  2:off  3:on   4:off  5:on   6:off

Recognizing Storage Changes for Linux

On Linux nodes, the cxfs-enumerate-wwns script enumerates the world wide names (WWNs) on the host that are known to CXFS. See “CXFS Mount Scripts” in Chapter 1.

The following script is run by cxfs_client when it reprobes the Fibre Channel controllers upon joining or rejoining membership:

/var/cluster/cxfs_client-scripts/cxfs-reprobe

For RHEL nodes, you can define a group of environment variables in the /etc/cluster/config/cxfs_client.options file in order for cxfs-reprobe to probe specific targets on the SCSI bus.

The script detects the presence of the SCSI and/or XSCSI layers on the system and defaults to probing whichever layers are detected. You can override this decision by setting CXFS_PROBE_SCSI and/or CXFS_PROBE_XSCSI to either 0 (to disable the probe) or 1 (to force the probe) on the appropriate bus.

When an XSCSI scan is performed, all buses are scanned by default. You can override this decision by specifying a space-separated list of buses in CXFS_PROBE_XSCSI_BUSES. (If the list includes spaces, you must enclose it within single quotation marks.) For example:

export CXFS_PROBE_XSCSI_BUSES='/dev/xscsi/pci0001:00:03.0-1/bus /dev/xscsi/pci0002:00:01.0-2/bus'

When a SCSI scan is performed, a fixed range of buses/channels/IDs and LUNs are scanned; these ranges may need to be changed to ensure that all devices are found. The ranges can also be reduced to increase scanning speed if a smaller space is sufficient.

The following summarizes the environment variables (separate multiple values with white space and enclose the list within single quotation marks):

CXFS_PROBE_SCSI=0|1
    Stops (0) or forces (1) a SCSI probe. Default: 1 if the SCSI layer is detected

CXFS_PROBE_SCSI_BUSES=BusList
    Scans the buses listed. Default: 0 1 2

CXFS_PROBE_SCSI_CHANNELS=ChannelList
    Scans the channels listed. Default: 0

CXFS_PROBE_SCSI_IDS=IDList
    Scans the IDs listed. Default: 0 1 2 3

CXFS_PROBE_SCSI_LUNS=LunList
    Scans the LUNs listed. Default: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

CXFS_PROBE_XSCSI=0|1
    Stops (0) or forces (1) an XSCSI probe. Default: 1 if the XSCSI layer is detected

CXFS_PROBE_XSCSI_BUSES=BusList
    Scans the buses listed. Default: all XSCSI buses

For example, the following would only scan the first two SCSI buses:

export CXFS_PROBE_SCSI_BUSES='0 1'

The following would scan 16 LUNs on each bus, channel, and ID combination (all on one line):

export CXFS_PROBE_SCSI_LUNS='0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15'
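For longer ranges, the list can be generated instead of typed. The following is a minimal sketch; the function name probe_luns_line is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print an export line that scans LUNs 0 through COUNT-1.
probe_luns_line() {
    count=$1
    list=
    i=0
    while [ "$i" -lt "$count" ]; do
        list="${list:+$list }$i"    # append with a single separating space
        i=$((i + 1))
    done
    printf "export CXFS_PROBE_SCSI_LUNS='%s'\n" "$list"
}

probe_luns_line 16
# prints export CXFS_PROBE_SCSI_LUNS='0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15'
```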

Other options within the /etc/cluster/config/cxfs_client.options file begin with a - character. Following is an example cxfs_client.options file:

# Example cxfs_client.options file
#
-Dnormal -serror
export CXFS_PROBE_SCSI_BUSES=1
export CXFS_PROBE_SCSI_LUNS='0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20'


Note: The - character or the term export must start in the first position of each line in the cxfs_client.options file; otherwise, they are ignored by the cxfs_client service.
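Because misplaced or misspelled lines are silently ignored, it can help to check the options file before restarting cxfs_client. The following is a minimal sketch, not an SGI tool: the function names are hypothetical, and the sketch assumes that comment lines beginning with # and blank lines are harmless.

```shell
#!/bin/sh
# Hypothetical lint for a cxfs_client.options-style file.

# Print (with line numbers) the non-blank lines that do not begin in the first
# position with "-", "export", or "#".
ignored_option_lines() {
    grep -n -v -E '^(-|export[[:space:]]|#|[[:space:]]*$)' "$1"
}

# Print exported CXFS_PROBE_* names that are not among the variables
# summarized in this section, which catches misspelled names.
unknown_probe_vars() {
    known=' CXFS_PROBE_SCSI CXFS_PROBE_SCSI_BUSES CXFS_PROBE_SCSI_CHANNELS CXFS_PROBE_SCSI_IDS CXFS_PROBE_SCSI_LUNS CXFS_PROBE_XSCSI CXFS_PROBE_XSCSI_BUSES '
    sed -n 's/^export[[:space:]]*\(CXFS_PROBE_[A-Z_]*\)=.*/\1/p' "$1" |
        while read -r name; do
            case $known in
                *" $name "*) ;;
                *) echo "$name" ;;
            esac
        done
}

# Example use on a node:
#   ignored_option_lines /etc/cluster/config/cxfs_client.options
#   unknown_probe_vars /etc/cluster/config/cxfs_client.options
```

For instance, a line that begins with a space before export would be reported by the first function, and a misspelled name such as CXFS_PROBE_SCSI_BUSSES would be reported by the second.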


Using cxfs-reprobe with RHEL

When cxfs_client needs to rescan disk buses, it executes the /var/cluster/cxfs_client-scripts/cxfs-reprobe script. This requires the use of parameters in RHEL due to limitations in the SCSI layer. You can export these parameters from the /etc/cluster/config/cxfs_client.options file.

The script detects the presence of the SCSI and/or XSCSI layers on the system and defaults to probing whichever layers are detected. You can override this decision by setting CXFS_PROBE_SCSI (for Linux SCSI) or CXFS_PROBE_XSCSI (for Linux XSCSI) to either 0 (to disable the probe) or 1 (to force the probe).

When an XSCSI scan is performed, all buses are scanned by default. You can override this by specifying a space-separated list of buses in CXFS_PROBE_XSCSI_BUSES. (If the list includes spaces, you must enclose it within single quotation marks.) For example:

export CXFS_PROBE_XSCSI_BUSES='/dev/xscsi/pci01.03.0-1/bus /dev/xscsi/pci02.01.0-2/bus'

When a SCSI scan is performed, a fixed range of buses/channels/IDs and LUNs are scanned; these ranges may need to be changed to ensure that all devices are found. The ranges can also be reduced to increase scanning speed if a smaller space is sufficient.

The following summarizes the environment variables (separate multiple values with white space and enclose the list within single quotation marks):

CXFS_PROBE_SCSI=0|1
    Stops (0) or forces (1) a SCSI probe. Default: 1 if the SCSI layer is detected

CXFS_PROBE_SCSI_BUSES=BusList
    Scans the buses listed. Default: 0 1 2

CXFS_PROBE_SCSI_CHANNELS=ChannelList
    Scans the channels listed. Default: 0

CXFS_PROBE_SCSI_IDS=IDList
    Scans the IDs listed. Default: 0 1 2 3

CXFS_PROBE_SCSI_LUNS=LunList
    Scans the LUNs listed. Default: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

CXFS_PROBE_XSCSI=0|1
    Stops (0) or forces (1) an XSCSI probe. Default: 1 if the XSCSI layer is detected

CXFS_PROBE_XSCSI_BUSES=BusList
    Scans the buses listed. Default: all XSCSI buses

For example, the following would only scan the first two SCSI buses:

export CXFS_PROBE_SCSI_BUSES='0 1'

The following would scan 16 LUNs on each bus, channel, and ID combination (all on one line):

export CXFS_PROBE_SCSI_LUNS='0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15'
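Long lists such as this are easy to mistype. As a minimal sketch, the export line could be generated rather than typed by hand. The helper name probe_list is hypothetical and is not part of CXFS, and the sketch assumes that the GNU seq command is available on the node:

```shell
# Hypothetical helper (not part of CXFS): print an export line for a
# cxfs-reprobe variable covering the inclusive range FIRST..LAST.
probe_list() {
    var=$1 first=$2 last=$3
    # seq -s ' ' joins the generated numbers with single spaces
    printf "export %s='%s'\n" "$var" "$(seq -s ' ' "$first" "$last")"
}

# Prints the same line as the example above (LUNs 0 through 15)
probe_list CXFS_PROBE_SCSI_LUNS 0 15
```

The printed line can then be copied into the /etc/cluster/config/cxfs_client.options file, starting in the first position of the line.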

Other options within the /etc/cluster/config/cxfs_client.options file begin with a - character. Following is an example cxfs_client.options file:

# Example cxfs_client.options file
#
-Dnormal -serror
export CXFS_PROBE_SCSI_BUSES=1
export CXFS_PROBE_SCSI_LUNS='0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20'


Note: The - character or the term export must start in the first position of each line in the cxfs_client.options file; otherwise, the line is ignored by the cxfs_client service.
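Because a misplaced leading space silently disables a line, it can be useful to check the file before restarting the service. The following is a minimal sketch only: check_options is a hypothetical helper that is not part of CXFS, and it assumes that blank lines and lines beginning with # (as in the example file above) are acceptable:

```shell
# Hypothetical checker (not part of CXFS): print, with line numbers, every
# line of an options file that is not blank, not a # comment, and does not
# start with - or "export " in the first position.
check_options() {
    grep -n -v -E '^(-|export |#|$)' "$1"
}
```

For example, running check_options /etc/cluster/config/cxfs_client.options prints nothing if every line starts in the first position; any line that it prints is one that the cxfs_client service would ignore.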


GRIO on Linux

CXFS supports guaranteed-rate I/O (GRIO) version 2 on the Linux platform if GRIO is enabled on the server-capable administration node. However, GRIO is disabled by default on Linux client-only nodes. To enable GRIO on a Linux client-only node, you must install the GRIO software as documented in “Linux Installation Procedure” and do the following:

  1. Change the following line in /etc/cluster/config/cxfs_client.options from:

    export GRIO2=off

    to:

    export GRIO2=on

  2. Reboot the system.
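Step 1 can also be scripted. The following is a minimal sketch, assuming that the options file already contains the line export GRIO2=off exactly as shown above; enable_grio2 is a hypothetical helper name, not a CXFS command:

```shell
# Hypothetical helper (not part of CXFS): change "export GRIO2=off" to
# "export GRIO2=on" in the given options file. sed keeps the original
# file as a backup with a .bak suffix.
enable_grio2() {
    sed -i.bak 's/^export GRIO2=off$/export GRIO2=on/' "$1"
}
```

For example, enable_grio2 /etc/cluster/config/cxfs_client.options edits the file in place. If the file does not contain the expected line, it is left unchanged. You must still reboot the system as described in step 2.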

Application bandwidth reservations must be explicitly released by the application before exit. If the application terminates unexpectedly or is killed, its bandwidth reservations are not automatically released and will cause a bandwidth leak. If this happens, you can recover the lost bandwidth by rebooting the node.

A Linux node can mount a GRIO-managed filesystem and supports node-level reservations. A Linux node will interoperate with the dynamic bandwidth allocator for all I/O outside of any reservation.

For more information, see “Guaranteed-Rate I/O (GRIO) and CXFS” in Chapter 1 and the Guaranteed-Rate I/O Version 2 for Linux Guide.

XVM Failover V2 on Linux

Following is an example of the /etc/failover2.conf file on a Linux system (this could be RHEL, SLES, or SGI Foundation Software):

/dev/disk/by-path/pci-0000:06:02.1-fc-0x200800a0b8184c8e:0x0000000000000000 affinity=0 preferred
/dev/disk/by-path/pci-0000:06:02.1-fc-0x200900a0b8184c8d:0x0000000000000000 affinity=1

Following is an example of the /etc/failover2.conf file on a Linux SGI Foundation Software system:

/dev/xscsi/pci0004:00:01.1/node200900a0b813b982/port1/lun4/disc, affinity=1
/dev/xscsi/pci0004:00:01.1/node200900a0b813b982/port2/lun4/disc, affinity=2
/dev/xscsi/pci0004:00:01.0/node200900a0b813b982/port1/lun4/disc, affinity=1
/dev/xscsi/pci0004:00:01.0/node200900a0b813b982/port2/lun4/disc, affinity=2
/dev/xscsi/pci0004:00:01.1/node200800a0b813b982/port1/lun4/disc, affinity=4
/dev/xscsi/pci0004:00:01.1/node200800a0b813b982/port2/lun4/disc, affinity=3 preferred
/dev/xscsi/pci0004:00:01.0/node200800a0b813b982/port1/lun4/disc, affinity=4
/dev/xscsi/pci0004:00:01.0/node200800a0b813b982/port2/lun4/disc, affinity=3
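When a file lists many paths, it can be convenient to see at a glance which ones are marked preferred. The following is a minimal sketch that relies only on the layout shown in the two examples above (path first, the keyword preferred last); preferred_paths is a hypothetical helper, not part of XVM or CXFS:

```shell
# Hypothetical helper (not part of XVM or CXFS): print the device path of
# every entry whose last field is the keyword "preferred". A trailing comma
# after the path, as in the second example, is removed.
preferred_paths() {
    awk '$NF == "preferred" { sub(/,$/, "", $1); print $1 }' "$1"
}
```

For example, preferred_paths /etc/failover2.conf run against the second example file would print only the port2 path on pci0004:00:01.1 for node200800a0b813b982.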

For more information, see:

Troubleshooting for Linux

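This section discusses the following:

  • “Device Filesystem Enabled for Linux”

  • “The cxfs_client Daemon is Not Started on Linux”

  • “Filesystems Do Not Mount on Linux”

  • “Unable to use the dmi Mount Option”

  • “Large Log Files on Linux”

  • “xfs off Output from chkconfig”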

For general troubleshooting information, see Chapter 10, “General Troubleshooting” and Appendix D, “Error Messages”.

Device Filesystem Enabled for Linux

The kernels provided for the Linux node have the Device File System (devfs) enabled. This can cause problems with locating system devices in some circumstances. See the devfs FAQ at the following location:

http://www.atnf.csiro.au/people/rgooch/linux/docs/devfs.html

The cxfs_client Daemon is Not Started on Linux

Confirm that the cxfs_client is not running. The following command would list the cxfs_client process if it were running:

linux# ps -ax | grep cxfs_client

Check the cxfs_client log file for errors.

Restart cxfs_client as described in “Start/Stop cxfs_client for Linux” and watch the cxfs_client log file for errors.

To be sure that cxfs_client is configured to start up on boot, view the chkconfig output, which should appear similar to the following:

linux# chkconfig --list | grep cxfs_client
cxfs_client               0:off  1:off  2:off  3:on   4:off  5:on   6:off
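This check can be automated. The following is a minimal sketch that assumes, as in the sample output above, that runlevels 3 and 5 are the ones in which the service should be on; on_at_boot is a hypothetical helper, not a CXFS or chkconfig command:

```shell
# Hypothetical check (not part of CXFS): read one chkconfig output line from
# standard input and succeed only if it shows both 3:on and 5:on.
on_at_boot() {
    awk '{ ok = 0; for (i = 2; i <= NF; i++) if ($i == "3:on" || $i == "5:on") ok++ }
         END { exit (ok == 2 ? 0 : 1) }'
}
```

For example, the pipeline chkconfig --list | grep cxfs_client | on_at_boot returns a zero exit status only when the service is on at both runlevels. It also fails if no cxfs_client line is found.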

Filesystems Do Not Mount on Linux

If cxfs_info reports that cms is up but XVM or the filesystem is in another state, then one or more filesystems are still in the process of mounting or have failed to mount.

The CXFS node might not mount filesystems for the following reasons:

  • The node may not be able to see all of the LUNs. This is usually caused by misconfiguration of the HBA or the SAN fabric:

    • Check that the ports on the Fibre Channel switch connected to the HBA are active. Physically look at the switch to confirm the light next to the port is green, or remotely check by using the switchShow command.

    • Check that the HBA configuration is correct.

    • Check that the HBA can see all the LUNs for the filesystems it is mounting.

    • Check that the operating system kernel can see all the LUN devices.

    • If the RAID device has more than one LUN mapped to different controllers, ensure the node has a Fibre Channel path to all relevant controllers.

  • The cxfs_client daemon may not be running. See “The cxfs_client Daemon is Not Started on Linux”.

  • The filesystem may have an unsupported mount option. Check the /var/log/cxfs_client log for mount option errors or any other errors that are reported when attempting to mount the filesystem.

  • The cluster membership (cms), XVM, or the filesystems may not be up on the node. Execute the /usr/cluster/bin/cxfs_info command to determine the current state of cms, XVM, and the filesystems. If the node is not up for each of these, then check the /var/log/cxfs_client log to see what actions have failed.

    Do the following:

    • If cms is not up, check the following:

    • If XVM is not up, check that the HBA is active and can see the LUNs.

    • If the filesystem is not up, check that one or more filesystems are configured to be mounted on this node and check the /var/log/cxfs_client file for mount errors.
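Several of the checks above involve searching the /var/log/cxfs_client log. The following is a minimal sketch of such a search. The exact wording of the log messages is not described here, so the keywords mount, error, and fail are assumptions, and mount_errors is a hypothetical helper that is not part of CXFS:

```shell
# Hypothetical helper (not part of CXFS): show the last N lines (default 20)
# of a log file that mention "mount" together with "error" or "fail".
# Matching is case-insensitive; the keywords are assumptions about the
# message wording.
mount_errors() {
    grep -i 'mount' "$1" | grep -i -E 'error|fail' | tail -n "${2:-20}"
}
```

For example, mount_errors /var/log/cxfs_client 5 prints at most the five most recent matching lines. An empty result does not prove that mounting succeeded, so also review the log directly if the problem persists.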

Unable to use the dmi Mount Option

By default, DMAPI is turned off on SLES 10 and SLES 11 systems. If you try to mount with the dmi mount option, you will see errors such as the following:

kernel: XFS: unknown mount option [dmi].

See “Using the dmi Mount Option on a SLES 10 or SLES 11 Node”.

Large Log Files on Linux

The /var/log/cxfs_client log file may become quite large over a period of time if the verbosity level is increased.

See the cxfs_client.options man page and “Log Files on Linux”.

xfs off Output from chkconfig

The following output from chkconfig --list refers to the X Font Server, not the XFS filesystem, and has no association with CXFS:

xfs                       0:off  1:off  2:off  3:off  4:off  5:off  6:off

Reporting Linux Problems

Before reporting a problem to SGI, you should run the cxfsdump command:

linux# cxfsdump

This will collect the following information:

  • System information

  • CXFS registry settings

  • CXFS client logs

  • CXFS version information

  • Network settings

  • Event log

The cxfsdump -help command displays a help message.

Send the tar.gz file that is created in the /var/cluster/cxfsdump-data/date_time directory to SGI.

Gather the following information:

  • Obtain information about the entire cluster by running the cxfsdump utility on a server-capable administration node. See the information in the CXFS 5 Administration Guide for SGI InfiniteStorage.

  • Number of LSI HBAs installed, the model numbers, and firmware versions:

    linux# lsiutil

  • Any messages that appeared in the system logs immediately before the system exhibited the problem.

  • The debugger information from the kdb built-in kernel debugger for SGI Foundation Software systems on an SGI Altix ia64 system after a system kernel panic.


    Caution: When the system enters the debugger after a panic, it will render the system unresponsive until the user exits from the debugger. Also, if kdb is entered while the system is in graphical (X) mode, the debugger prompt cannot be seen. For these reasons, kdb is turned off by default.

    You can temporarily enable kdb by entering the following:

    linux# echo 1 > /proc/sys/kernel/kdb

    To enable kdb at every boot, place the following entry in the /etc/sysctl.conf file:

    # Turn on KDB
    kernel.kdb = 1

    For more information, see the sysctl man page.

    When kdb is enabled, a system panic will cause the debugger to be invoked and the keyboard LEDs will blink. The kdb prompt will display basic information. To obtain a stack trace, enter the bt command at the kdb prompt:

    kdb> bt

    To get a list of current processes, enter the following:

    kdb> ps

    To backtrace a particular process, enter the following, where PID is the process ID:

    kdb> btp PID

    To exit the debugger, enter the following:

    kdb> go

    If the system will be run in graphical mode with kdb enabled, SGI highly recommends that you use kdb on a serial console so that the kdb prompt can be seen.

  • Fibre Channel HBA World Wide Name mapping:

    cat /sys/class/fc_transport/bus_ID/node_name

    For example:

    cat /sys/class/fc_transport/11:0:0:0/node_name

    The bus_ID value is the value of the SysFS BusID field in the output of hwinfo --disk.
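To collect the mapping for every bus ID in one pass, rather than one cat command per device, a loop such as the following could be used. This is a minimal sketch: list_node_names is a hypothetical helper, not a CXFS command, and it assumes only the /sys/class/fc_transport/bus_ID/node_name layout shown above:

```shell
# Hypothetical helper (not part of CXFS): print "bus_ID node_name" for every
# entry under the given directory (default: /sys/class/fc_transport).
list_node_names() {
    base=${1:-/sys/class/fc_transport}
    for f in "$base"/*/node_name; do
        [ -r "$f" ] || continue    # skip if the pattern matched nothing
        printf '%s %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}
```

Running list_node_names with no argument reads from /sys/class/fc_transport. The function prints nothing if no node_name files are present.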