
Chapter 3. Installing and Configuring an SGI ICE X System

SGI installs operating system software on each ICE X system before factory shipment occurs. The topics in this chapter include the additional procedures that you need to complete in order to configure the system for your site.

If you want to completely reinstall the operating system and all other software, the topics in this chapter enable you to complete that task. For example, you might need to reinstall the operating system to meet site requirements or to recover a system in case of a disaster.

This chapter includes the following topics:

  • “Performing a New Installation and Configuring the Software on an SGI ICE X System”

  • “Preparing to Install Software on an SGI ICE X System”

  • “Setting a Static IP Address for the Baseboard Management Controller (BMC) in the System Admin Node (SAC)”

  • “(Optional) Configuring a Highly Available (HA) System Admin Controller (SAC)”

  • “Booting the System”

  • “Installing the Operating System”

  • “Running the Cluster Configuration Tool”

  • “(Conditional) Customizing the Cluster Configuration”

  • “Installing the SGI Management Center License Key”

  • “Synchronizing the Software Repository, Installing Software Updates, and Cloning the Images”

  • “Configuring the Switches”

  • “discover Command”


Note: If you are upgrading from a prior release or installing SMC for SGI ICE X software patches, see “Installing SMC for SGI ICE Patches and Updating SGI ICE Systems ”.


Performing a New Installation and Configuring the Software on an SGI ICE X System

Table 3-1 shows the installation and configuration process for a situation in which you want to install the SGI ICE X system from scratch. In this case, you reinstall the operating system on the nodes and configure everything yourself.

Table 3-1. SGI ICE X System Installation and Configuration Process

Step 1. Prepare to install the SGI ICE X software. See “Preparing to Install Software on an SGI ICE X System”.

Step 2. Configure a static IP address for the baseboard management controller (BMC) on the system admin controller (SAC). If you plan to configure a highly available SAC, configure the BMC on each of the two SACs. See “Setting a Static IP Address for the Baseboard Management Controller (BMC) in the System Admin Node (SAC)”.

Step 3. (Optional) Configure a highly available system admin controller. See “(Optional) Configuring a Highly Available (HA) System Admin Controller (SAC)”.

Step 4. Boot the system. See “Booting the System”.

Step 5. Install the operating system on the system admin controller (SAC) node. See “Installing the Operating System”.

Step 6. Run the cluster configuration tool, and complete the initial cluster set-up tasks, which include the following:

  • Set up software repositories for required and optional software

  • Install the SAC software

  • Configure network settings

  • Configure the NTP server

  • Set up the initial SAC infrastructure

  • Configure the house network DNS resolvers

See “Running the Cluster Configuration Tool”.

Step 7. (Conditional) Customize the cluster configuration. See “(Conditional) Customizing the Cluster Configuration”.

Step 8. Install the SGI Management Center (SMC) license key. See “Installing the SGI Management Center License Key”.

Step 9. Sync the repository updates, apply the latest patches to the newly installed software, and clone the images. See “Synchronizing the Software Repository, Installing Software Updates, and Cloning the Images”.

Step 10. Configure the switches. See “Configuring the Switches”.

Step 11. Use the discover command to install and configure the rack leader controller and service node software. See “discover Command”.

Step 12. Run the cluster configuration tool to configure the following:

  • (Optional) The backup DNS server

  • The InfiniBand network

Step 13. (Optional) Configure the backup DNS server. See “Configuring a Backup Domain Name System (DNS) Server”.

Step 14. Configure the InfiniBand network.


Preparing to Install Software on an SGI ICE X System

The following procedure lists pre-installation tasks that you need to complete before you begin working with the SGI ICE X system.

Procedure 3-1. To prepare for an installation

  1. Contact your site's network administrator, and obtain network information.

    Obtain the following information to use when you configure the baseboard management controller (BMC):

    • (Optional) The current IP address of the BMC on the system admin node (SAC). You can set the BMC address from a serial console if you do not have this information.

    • The address you want to set for the BMC.

    • The netmask you want to set for the BMC.

    • The default gateway you want to set for the BMC.

    Your network administrator can provide an IP address, a hostname, or a fully qualified domain name (FQDN) for each of the preceding addresses.

    Obtain the following information to use when you configure the network for the SGI ICE X system:

    • Hostname

    • Domain name

    • IP address

    • Netmask

    • Default route

    • Root password

    Obtain the following information about your site's house network:

    • IP addresses of the domain name servers (DNSs)

  2. Familiarize yourself with the boot parameters, and determine which boot parameters you want to use.

    You can configure your SGI ICE X system to boot from one, two (default), three, four, or five partitions. This enables you to configure your SGI ICE X system as either a single-boot computer system or as a multiple-boot computer system. A multiple-boot computer system has two or more partitions, so it has more than one root directory (/) and more than one boot directory (/boot). In an SGI ICE X system, these root and boot directories are paired into multiple slots . A multiple-slot disk layout is also called a cascading dual-root layout or a cascading dual-boot layout.

    The installation procedure explains how to create a default, two-slot SGI ICE X system and directs you to use the install boot parameter. If you want to create only one slot, or if you want to create three or more slots, you need to specify different boot parameters.

    The installer creates the same disk layout on all compute nodes. For more information about boot parameters, disk layouts, and so on, see “Boot Parameters, Disk Partitioning, and Managing a Multiboot System ” in Chapter 4.

  3. Obtain the following from your SGI representative:

    • The MAC file for your system. The MAC file contains MAC address information for the nodes. If you have these addresses, the node discovery process can complete more quickly.

Setting a Static IP Address for the Baseboard Management Controller (BMC) in the System Admin Node (SAC)

When you set a static IP address for the BMC on the SAC, you ensure access to the SAC when the site DHCP server is inaccessible. If you want to configure a highly available SAC, perform this topic's procedure on the BMC of each of the two SACs.

The following procedure explains how to set a static IP address.

Procedure 3-2. To set a static IP address for the BMC on the SAC

  1. Connect to the console on the SAC.

    You can make this connection in one of the following ways:

    • Use the terminal attached to the SAC.

    • Attach a keyboard, monitor, and mouse to the baseboard management controller (BMC) on the SAC.

    • Use a PC or workstation to connect to the BMC on the SAC over the network, and log in through the IPMI tool. This method assumes that you know the IP address of the BMC. Complete the following steps:

      1. Type the following command to obtain a console:

        ipmitool -H address -I lanplus -U ADMIN -P ADMIN sol activate

        For address, type the IP address, hostname, or FQDN of the BMC on the SAC.

      2. Type the following command to ensure that the IPMI tool is enabled whenever you reboot the SAC:

        chkconfig ipmi on

      3. Type the following command to start the IPMI service:

        service ipmi start

      4. Type the following commands to configure a static IP address on the BMC:

        ipmitool lan set 1 ipsrc static
        ipmitool lan set 1 ipaddr IP_addr
        ipmitool lan set 1 netmask netmask_addr
        ipmitool lan set 1 defgw ipaddr gateway_addr

        In these commands, 1 is the BMC's LAN channel; 1 is typical, but the channel number can differ on some hardware. For each of the addresses, specify the addresses you obtained from the network administrator in “Preparing to Install Software on an SGI ICE X System”.
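        To verify the new settings, you can print the BMC's LAN configuration for the same channel (shown here as an optional check):

        ipmitool lan print 1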

  2. Proceed to one of the following:

    • “(Optional) Configuring a Highly Available (HA) System Admin Controller (SAC)”

    • “Booting the System”

(Optional) Configuring a Highly Available (HA) System Admin Controller (SAC)

SGI enables you to configure the SAC and your rack leader controllers (RLCs) as highly available nodes in an SGI ICE X system. If you want to enable high availability, use the information in the following appendix:

Appendix B, “Installing a Highly Available System Admin Controller (SAC) or Rack Leader Controller (RLC)”

Booting the System

The following procedure explains how to boot the system and begin the installation.

Procedure 3-3. To boot the system

  1. (Conditional) Power-off the system admin controller (SAC).

    Perform this step only if the SAC is powered on at this time.

  2. Power-on the SAC.

    As Figure 3-1 shows, the power-on button is on the right of the SAC.

    Figure 3-1. SAC Power On Button and DVD Drive


  3. (Optional) Back up the cluster configuration snapshot in the /opt/sgi/var/ivt directory on the SAC.

    If you back up your system's factory configuration now, you can use the interconnect verification tool (IVT) to verify your hardware configuration later.

    You can use ftp or scp to copy the IVT files to another server at your site, or you can write the IVT files to a USB stick. For more information about IVT, see “Inventory Verification Tool” in Chapter 6.
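    For example, the following is a minimal sketch that uses scp to copy the IVT files to another server; backup-host and the destination path are hypothetical placeholders:

    # scp -r /opt/sgi/var/ivt root@backup-host:/var/tmp/ice-ivt-backup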

  4. (Optional) Configure the system so that you can perform the installation from a VGA screen and can perform later operations from a serial console.

    If you want to enable this capability, perform the following steps:

    • Use a text editor to open file /boot/grub/menu.lst .

    • Search the file for the word kernel at the beginning of a line.

    • Add the following to the kernel line: console=type.

      For example:

      kernel /boot/vmlinuz-2.6.16.46-0.12-smp root=/dev/disk/by-label/sgiroot console=ttyS1,38400n8
      splash=silent showopts

    • Add the console=type parameter to the end of every kernel line.

    Later, if you want to access the SAC from only a VGA, you can remove the console= parameters.

  5. Insert the SGI Admin Node Autoinstallation DVD into the DVD drive on the system admin controller (SAC).

    The autoinstallation messages appear and end at the boot: prompt.

  6. At the boot: prompt, type install and, optionally, other boot parameters.

    “Boot Parameters, Disk Partitioning, and Managing a Multiboot System ” in Chapter 4 explains the other, optional, boot parameters.
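    For example, to accept the default two-slot layout, type install by itself. The following sketch also appends a console= parameter, which is optional; the device shown (ttyS1,38400n8) is a hypothetical value taken from the earlier menu.lst example, so adjust or omit it for your hardware:

    boot: install console=ttyS1,38400n8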

    Monitor the installation. This can take several minutes.

  7. Remove the operating system installation DVD.

  8. At the # prompt, type reboot .

    This is the first boot from the SAC's hard disk.

  9. Proceed to one of the following:

    • “Installing SUSE Linux Enterprise Server (SLES)”

    • “Installing Red Hat Enterprise Linux (RHEL)”

Installing the Operating System

The SGI ICE X platform supports both the SUSE Linux Enterprise Server (SLES) and Red Hat Enterprise Linux (RHEL) operating systems. Use one of the following procedures to install your operating system software on the system admin controller (SAC) node:

Installing SUSE Linux Enterprise Server (SLES)

The SLES YaST2 interface enables you to install the SLES operating system on the SGI ICE X system. To navigate the YaST2 modules, use key combinations such as Tab (forward) and Shift + Tab (backward). Use the arrow keys to move up, down, left, and right. To use shortcuts, press Alt + the highlighted letter. Press Enter to complete or confirm an action. Ctrl + L refreshes the screen. For more information about navigation, see Appendix C, “YaST2 Navigation”.

The following procedure explains how to use YaST2 to install SLES 11 SP2 on an SGI ICE X system.

Procedure 3-4. To install SLES 11 SP2 on an SGI ICE X system

  1. On the Language and Keyboard Layout screen, complete the following steps:

    • Select your language

    • Select your keyboard layout

    • Select Next.

  2. On the Welcome screen, select Next.

  3. On the Hostname and Domain Name screen, complete the following steps:

    • Type the hostname for this SGI ICE X system.

    • Type the domain name.

    • Clear the box next to Change Hostname via DHCP . The box appears with an X in it by default, but you need to clear this box.

    • Select Assign Hostname to Loopback IP. Put an X in this box.

    • Select Next.

  4. On the Network Configuration screen, complete the following steps:

    • Select Change. A pop-up window appears.

    • On the pop-up window, choose Network Interfaces.

  5. On the Network Settings screen, complete the following steps:

    • Highlight the first network interface card that appears underneath Name.

    • Select Edit.

  6. On the Network Card Setup screen, specify the system admin controller's (SAC's) house/public network interface.

    Figure 3-2 shows the Network Card Setup screen.

    Figure 3-2. Network Card Setup Screen


    Complete the following steps:

    • Select Statically Assigned IP Address. SGI recommends a static IP address, not DHCP, for system admin nodes (SACs).

    • In the IP Address field, type the system's IP address.

    • In the Subnet Mask field, type the system's subnet mask.

    • In the Hostname field, type the system's fully qualified domain name (FQDN). SGI requires you to type an FQDN, not the system's shorter hostname, into this field. For example, type mysystem-admin.mydomainname.com. Failure to supply an FQDN in this field causes the configure-cluster command to fail.

    • Select Next.

    You can specify the default route, if needed, in a later step.

  7. On the Network Settings screen, complete the following steps:

    • Select Hostname/DNS.

    • In the Hostname field, type the system's fully qualified domain name (FQDN).

    • In the Domain Name field, type the domain name for your site.

    • Put an X in the box next to Assign Hostname to Loopback IP.

    • In the Name Servers and Domain Search List , type the name servers for your house network.

    • Back at the top of the screen, select Routing .

      The Network Settings > Routing screen appears.

    • In the Default Gateway field, type your site's default gateway.

    • Select OK.

  8. On the Network Configuration screen, click Next.

    The Saving Network Configuration screen appears and saves your configuration.

  9. On the Clock and Time Zone screen, complete the following steps:

    • Select your region.

    • Select your time zone.

    • (Optional) In the Hardware Clock Set To field, choose Local Time or accept the default of UTC.

    • Select Next.

    This step synchronizes the time in the BIOS hardware with the time in the operating system. Your choice depends on how the BIOS hardware clock is set. If the clock is set to GMT, which corresponds to UTC, your system can rely on the operating system to switch from standard time to daylight savings time and back automatically.
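    If you are not sure how the hardware clock is set, one hedged way to check from a running Linux system is to compare the two possible interpretations of the hardware clock with the standard hwclock utility:

    # hwclock --show --utc
    # hwclock --show --localtime

    The first command interprets the hardware clock as UTC; the second interprets it as local time. The reading that matches the actual local time indicates how the BIOS clock is set.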

  10. On the Password for System Administrator “root” screen, complete the following steps:

    • In the Password for root User field, type the password that you want to use for the root user on all nodes (SAC, rack leader controller (RLC), and service nodes) throughout the SGI ICE X system.

    • In the Confirm password field, type the root user's password again.

    • In the Test Keyboard Layout field, type a few characters.

      For example, if you specified a language other than English, type a few characters that are unique to that language. If these characters appear in this plain text field, you can use these characters in passwords safely.

    • Select Next.

  11. On the User Authentication Method screen, select one of the authentication methods and select Next.

    Typically, users accept the default (Local).

  12. On the New Local User screen, create additional user accounts or select Next.

    If you do not create additional users, select Yes on the Empty User Login warning pop-up window, and select Next.

  13. On the Installation Completed screen, select Finish.

  14. Log into the SAC and confirm that the system is working as expected.

    If necessary, restart YaST2 to correct settings.

Installing Red Hat Enterprise Linux (RHEL)


Initial Configuration of a RHEL 6 System Admin Controller (SAC)

This section describes how to configure Red Hat Enterprise Linux 6 on the system admin controller (SAC).

Procedure 3-5. Initial Configuration of a RHEL 6 SAC

    To perform the initial configuration of a RHEL6 SAC, perform the following steps:

    1. Add the IPADDR , NETMASK, and NETWORK values appropriate for the public (house) network interface to the /etc/sysconfig/network-scripts/ifcfg-eth0 file similar to the following example:

      IPADDR=128.162.244.88
      NETMASK=255.255.255.0
      NETWORK=128.162.244.0

    2. Create the /etc/sysconfig/network file similar to the following example:

      [root@localhost ~]# cat /etc/sysconfig/network
      NETWORKING=yes
      HOSTNAME=my-system-admin
      GATEWAY=128.162.244.1

    3. Add the IP address of the house network interface and the name(s) of the SAC to the /etc/hosts file, similar to the following example:

      # echo "128.162.244.88 my-system-admin.domain-name.mycompany.com my-system-admin" >> /etc/hosts

    4. Set the SAC hostname, as follows:

      # hostname my-system-admin

    5. Configure the /etc/resolv.conf file with your site's DNS servers. Later in the cluster setup process, these name servers become the defaults for the House DNS Resolvers that you configure in a subsequent configure-cluster step. Setting this now allows you to register with RHN and to reach your house network for any DVD images or other resources you need. You can choose to defer this step, but then you must also defer running rhn_register. The following is an example resolv.conf file:

      search mydomain.com
      nameserver 192.168.0.1
      nameserver 192.168.0.25

    6. Use the nscd(8) command to invalidate the nscd hosts cache, as follows:

      # nscd -i hosts
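      (Optional) You can confirm that the new entry resolves by querying it with the standard getent utility, as in this sketch that uses the example host name from the preceding steps:

      # getent hosts my-system-admin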

    7. Restart or start the following services, in the order shown:

      # /etc/init.d/network restart
      # /etc/init.d/rpcbind start
      # /etc/init.d/nfslock start

    8. Set the local timezone. The timezone is defined by /etc/localtime, a timezone definition file. To determine the currently configured timezone, type the following:

      # strings /etc/localtime | tail -1
      CST6CDT,M3.2.0,M11.1.0

      Link the appropriate timezone file from directory /usr/share/zoneinfo to /etc/localtime. For example, set timezone to Pacific Time / Los Angeles, as follows:

      # /bin/cp -l /usr/share/zoneinfo/PST8PDT /etc/localtime.$$
      # /bin/mv /etc/localtime.$$ /etc/localtime

      Confirm the timezone, as follows:

      # strings /etc/localtime | tail -1
      PST8PDT,M3.2.0,M11.1.0
      

    9. (Conditional) Edit file /etc/ntp.conf to direct requests to the NTP server at your site.

      Complete the following steps if you use the RHEL operating system and you want to direct requests to your site's NTP server instead of to the public time servers of the pool.ntp.org project:

      • Use a text editor to open file /etc/ntp.conf .

      • Insert a pound character (#) into column 1 of each line that includes rhel.pool.ntp.org.


        Note: Do not edit or remove entries that serve the cluster networks.

        The following is an example of a correctly edited file:

        # Use public servers from the pool.ntp.org project.
        # Please consider joining the pool (http://www.pool.ntp.org/join.html).
        #server 0.rhel.pool.ntp.org
        #server 1.rhel.pool.ntp.org
        #server 2.rhel.pool.ntp.org
        server ntp.mycompany.com

      • Type the following command to restart the NTP server:

        # /etc/init.d/ntpd restart
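        After the restart, you can confirm that the daemon is using your site's server by querying its peer list with the standard ntpq utility (an optional, hedged check):

        # ntpq -p

        The output should list your site's server, for example ntp.mycompany.com, rather than the rhel.pool.ntp.org hosts.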

    10. Make sure you have registered with the Red Hat Network (RHN). If you have not yet registered, run the following command:

      # /usr/bin/rhn_register

    11. Run the configure-cluster command. See “Running the Cluster Configuration Tool”.

    Running the Cluster Configuration Tool

    The cluster configuration tool enables you to configure, or reconfigure, your SGI ICE X system. The procedure in this topic explains the general, required configuration steps for all SGI ICE X systems. If your SGI ICE X system includes optional components, or if your site has specific requirements that require further customization, later procedures explain how to use the cluster configuration tool to create a more customized environment.

    The following are the required cluster configuration steps:

    • Create repositories for software installation files and updates.

    • Install the system admin node (SAC) cluster software.

    • Configure the cluster subdomain and examine other network settings. The cluster subdomain is likely to be different from the eth0 domain on the SAC itself.

    • Configure the NTP server.

    • Install the cluster's software infrastructure. This step can take 30 minutes.

    • Configure the house network's DNS resolvers.

    The following procedure explains how to perform the required configuration tasks.

    Procedure 3-6. To run the cluster configuration tool

    1. Locate your site's SGI software distribution DVDs or verify the path to your site's online software repository.

      You can install the software from either physical media or from an ISO on your network.

    2. From the VGA screen, or through an ssh connection, log into the system admin controller (SAC) as the root user.

      SGI recommends that you run the cluster configuration tool either from the VGA screen or from an ssh session to the system admin controller (SAC). Avoid running the configure-cluster command from a serial console.

    3. Type the following command to start the cluster configuration tool:

      # /opt/sgi/sbin/configure-cluster

    4. On the cluster configuration tool's Initial Configuration Check screen, select OK on the initial pop-up window.

      Figure 3-3 shows the initial screen and the pop-up.

      Figure 3-3. Initial Configuration Check Screen


      The cluster configuration tool recognizes a configured cluster. If you start the tool on a configured SGI ICE X system, it opens into the Main Menu.

    5. On the Initial Cluster Setup screen, select OK on the initial pop-up window.

      Figure 3-4 shows the initial screen and the pop-up.

      Figure 3-4. Initial Cluster Setup Screen with the initial pop-up window


    6. On the Initial Cluster Setup screen, select R Repo Manager: Set Up Software Repos, and click OK.

      Figure 3-5 shows the Initial Cluster Setup screen with the task menu. This procedure guides you through the tasks you need to perform for each of the menu selections on the Initial Cluster Setup screen.

      Figure 3-5. Initial Cluster Setup Tasks Screen


      The next few steps create software repositories for the initial installation packages and for updates. You need to create repositories for the following software:

      • The operating system software, either RHEL or SLES

      • SGI Foundation Suite

      • SGI Management Center (SMC) for SGI ICE X

      • Additional software packages for which you hold licenses, such as the Message Passing Toolkit, the SGI Performance Suite, and any others

    7. On the Repo Manager screen, respond to the pop-up windows as follows:

      • On the One or more ISOs were ... pop-up window, select Yes.

      • On the Repositories are created ... pop-up window, press Enter.

      • On the You will now be prompted ... pop-up window, select OK.

      • On the Would you like to register ... pop-up window, select Yes.

      • On the last pop-up window, specify whether you want to install from DVDs or from an ISO image.

    8. Perform one of the following step sequences to create the software repositories:

      • To create the repositories from DVDs, complete the following sequence of steps:

        1. Select Insert DVD.

        2. Insert a DVD.

        3. Select Mount inserted DVD.

        4. On the Media registered successfully with crepo ... screen, select OK, and eject the DVD.

        5. On the Would you like to register ... pop-up window, select Yes if you have more software that you need to register.

          If you select Yes, repeat the preceding steps in this sequence for the next DVD.

          If you select No, proceed to the next step.

      • To create the repositories from ISO images, complete the following sequence of steps:

        1. Select Use custom path/URL.

        2. On the Please enter the full path to the mount point or the ISO file ... screen, type the full path in server_name: path_name/iso_file format. This field also accepts a URL or an NFS path. Select OK after typing the path.

        3. On the Media registered successfully with crepo ... screen, select OK.

        4. On the Would you like to register ... pop-up window, select Yes if you have more software that you need to register.

          If you select Yes, repeat the preceding tasks in this sequence for the next ISO image.

          If you select No, proceed to the next step.

    9. On the Initial Cluster Setup screen, select I Install Admin Cluster Software, and select OK.

      This step installs the cluster software that you wrote to the repositories.

    10. On the Initial Cluster Setup screen, select N Network Settings, and select OK.

    11. On the Cluster Network Settings screen, select C Configure Subnet Addresses, and select OK.

    12. On the Warning: Changing the subnet IP addresses ... screen, click OK.

    13. Review the settings on the Subnet Network Addresses screen, and modify these settings only if absolutely necessary.

      Select either OK or Back if you accept the defaults.

      If your site has network requirements that conflict with the defaults, you need to change the network settings. On the Update Subnet Addresses screen, the Head Network field shows the SAC's IP address. SGI recommends that you do not change the IP address of the SAC or rack leader controllers (RLCs) if at all possible. You can change the IP addresses of the InfiniBand network ( IB0 and IB1) to match the IP requirements of the house network, and then select OK.

    14. On the Cluster Network Settings screen, select D Configure Cluster Domain Name, and select OK.

    15. On the Please enter the domain name for this cluster. pop-up window, type the domain name, and select OK.

      The domain you type becomes a subdomain to your house network.

      For example, type ice.americas.sgi.com.

    16. On the Cluster Network Settings screen, select Back.

    17. On the Initial Cluster Setup screen, select T Configure Time Client/Server (NTP), and select OK.

    18. On the pop-up window that begins with This procedure will replace your ntp configuration file ..., select Yes.

    19. On the pop-up window that begins with A new ntp.conf has been put in to position ..., select OK.

      On the subsequent screens, you set the SAC as the time server to the SGI ICE X system. The cluster configuration tool screens differ for the RHEL and SLES operating systems. On RHEL platforms, you return to the Initial Cluster Setup menu. On SLES platforms, use the SLES documentation to guide you through the NTP configuration screens.

    20. On the Initial Cluster Setup menu, select S Perform Initial Admin Node Infrastructure Setup, and select OK.

    21. On the pop-up window that begins with A script will now perform the initial cluster ..., select OK.

    22. On the Admin Infrastructure One Time Setup screen, in the Initial Cluster Setup Complete pop-up, select OK.

      This step runs a series of scripts that configure the SAC on the SGI ICE X system. The scripts also create the root images for the RLC, service, and compute nodes. The scripts run for approximately 30 minutes and, when they finish, write a line that includes install-cluster completed in their output.

      The final output of the script is as follows:

      /opt/sgi/sbin/create-default-sgi-images Done!

      The output of the mksiimage commands is stored in a log file at the following location:

      /var/log/cinstallman
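      If you want to watch progress while the scripts run, you can follow that log from another shell with the standard tail utility, as in this sketch:

      # tail -f /var/log/cinstallman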

    23. On the Initial Cluster Setup menu, select D Configure House DNS Resolvers, and select OK.

      Figure 3-6 shows the Configure House DNS Resolvers screen.

      Figure 3-6. Configure House DNS Resolvers Screen


      The system autopopulates the values on the Configure House DNS Resolvers screen to match the DNS specifications on the SAC. The DNS resolvers you specify here enable the service nodes to resolve host names on your network. You can set the DNS resolvers to the same name servers used on the SAC itself.

      Complete the following steps:

      1. Perform one of the following actions:

        • To accept these settings, select OK.

        • To change the settings, type in different IP addresses, and select OK.

        • To disable house network resolvers, select Disable House DNS.

      2. On the Setting DNS Forwarders to ... pop-up window, select Yes.

    24. On the Initial Cluster Setup complete pop-up window, select OK.

    25. On the Initial Cluster Setup screen, select Back.

    26. Proceed to “(Conditional) Customizing the Cluster Configuration”.

    (Conditional) Customizing the Cluster Configuration

    This topic explains how to use the cluster configuration tool to enable features for specific situations. The features you need to enable depend on your hardware platform's features and your site requirements. When you use the cluster configuration tool, you set global values that apply to all nodes that you discover after you set the value. When you perform an initial installation, none of the nodes have been discovered at this point in the configuration procedure, so all nodes you discover later are set the same. However, if you use the cluster configuration tool to change values on a system that is already configured, you might need to reset values on older, existing nodes that you configured previously. You can use commands to reset the values on previously configured nodes.

    Table 3-2 shows the cluster configuration tool Main Menu options and explains those that you need to configure.

    Table 3-2. Cluster Configuration Tool Main Menu Options

    B Configure Backup DNS Server
      Optional on all platforms. This menu option enables you to configure one of your service nodes as a DNS server. If the DNS on your system admin controller (SAC) is unavailable, the service node you configure here can act as a DNS server for your system. For more information, see “Configuring a Backup Domain Name System (DNS) Server”.

    M Configure Redundant Management Network
      Enable on all SGI ICE X platforms. On SGI ICE X and SGI ICE 8400 systems, the default is yes (enabled). On SGI ICE 8200 systems, the default is yes (enabled), but you need to set this to no (disabled). Enables a secondary network from the SAC, RLCs, and service nodes to the cluster network. For more information, see “Configuring a Redundant Management Network (RMN)”.

    S Configure Switch Management Network
      The default is no (disabled). Set this to yes (enabled) on all SGI ICE X platforms. Accept the default (no) on SGI ICE 8400 and SGI ICE 8200 platforms. This selection enables or disables link aggregation within the cluster. The Ethernet switch controls all VLANs and trunking within the cluster. For more information, see “Configuring a Switch Management Network”.

    N Configure MCell Network
      Enable on all SGI ICE X platforms that are equipped with MCells. Enables the separate network that the MCells require. For more information, see “Configuring an MCell Network”.

    Q Configure MySQL Replication
      The default is yes (enabled). Enable on very large systems. When enabled, the cluster's MySQL database keeps the leader and service nodes synchronized. For more information, see “Configuring MySQL Database Server Replication”.

    U Configure Default Max Rack IRU Setting
      Verify the setting on all platforms. When set appropriately, it minimizes the time it takes to distribute software to the blades. For more information, see “Configuring the Default Maximum Rack Individual Rack Unit (IRU) Setting”.

    C Configure blademond rescan interval
      Optional on all platforms. Enables automatic blade discovery. For more information, see “Configuring the blademond Rescan Interval”.

    The following procedure explains how to use the cluster configuration tool to configure additional features on your SGI ICE system.

    Procedure 3-7. To configure system-wide values for specific configurations

    1. Log into the system admin controller (SAC) as the root user, and type the following command to start the cluster configuration tool:

      # /opt/sgi/sbin/configure-cluster

    2. Choose menu selections from the Cluster Configuration Tool: Main Menu that you want to configure for your system.

      Table 3-2 summarizes the menu items, and the following topics provide additional information:

    3. Proceed to the following:

      “Installing the SGI Management Center License Key”

    Configuring a Backup Domain Name System (DNS) Server

    When you configure a backup DNS, the compute nodes can use a service node as a secondary DNS server if the rack leader controller (RLC) is not available.

    The following procedure explains how to configure a service node to act as a DNS server when the RLC is down or being serviced.

    Procedure 3-8. To enable a backup DNS

    1. (Conditional) Log into the system admin controller (SAC) as the root user, and type the following command to start the cluster configuration tool:

      # /opt/sgi/sbin/configure-cluster

      Perform this step only if the cluster configuration tool is not running at this time.

    2. (Optional) Type the following command to retrieve a list of available service nodes:

      # cnodes --service

    3. On the Main Menu screen, select B Configure Backup DNS Server (optional), and select OK.

    4. On the pop-up window that appears, type the identifier for the service node that you want to designate as the backup DNS, and select OK.

      Figure 3-7 shows how to specify service0 as the backup DNS.

      Figure 3-7. Configure Backup DNS Server (service node) pop-up window


    To disable this feature, select Disable Backup DNS from the same menu and select Yes to confirm your choice. For information about how to use commands to enable or disable this feature, see the following:

    “Enabling or Disabling the Backup Domain Name Service (DNS) on an SGI ICE X Cluster” in Chapter 4

    Configuring a Redundant Management Network (RMN)

    An RMN is a secondary network from the nodes to the cluster network. The RMN is enabled by default for all platforms. On SGI ICE X systems and SGI ICE 8400 systems, make sure the RMN is enabled. On SGI ICE 8200 systems, make sure to disable the RMN.

    When an RMN is enabled for a node, the default Linux bonding mode for RLCs and service nodes is 802.3ad link aggregation. The RMN has the following additional characteristics:

    • The GigE switches are doubled in the system control network and stacked (using stacking cables).

    • The links from the chassis management controllers (CMCs) are doubled.

    • Some links from the system admin controller (SAC), rack leader controllers (RLCs), and most service nodes are doubled.

    • Baseboard management controller (BMC) connections are not doubled, which means that certain failures can cause temporary inaccessibility to the BMCs. During these failures, the host interfaces remain accessible.

    When you use the cluster configuration tool to configure an RMN, the system enables an RMN for all nodes that you discover after you enable the setting. If you have existing nodes in the cluster without an RMN, those existing nodes are not changed. The following procedure explains how to configure an RMN from the cluster configuration tool.

    Procedure 3-9. To enable the RMN from the cluster configuration tool

    1. (Conditional) Log into the system admin controller (SAC) as the root user, and type the following command to start the cluster configuration tool:

      # /opt/sgi/sbin/configure-cluster

      Perform this step only if the cluster configuration tool is not running at this time.

    2. On the Main Menu screen, select M Configure Redundant Management Network (optional), and select OK.

    3. On the pop-up window that appears, select Y yes (default), and select OK.

    For information about how to use commands to enable or disable the RMN, see the following:

    “Enabling or Disabling the Redundant Management Network (RMN)” in Chapter 4

    For a diagram that shows an RMN, see Figure 1-11. For information about link aggregation, see “Link Aggregation, Rack Leader Controllers (RLCs), and Service Nodes ”.

    Configuring a Switch Management Network

    On an SGI ICE X system, enable the switch management network. On SGI ICE 8400 and SGI ICE 8200 systems, disable the switch management network. If your cluster mixes SGI ICE X racks with either SGI ICE 8400 or SGI ICE 8200 racks, use the cluster configuration tool to enable the switch management network in the cluster.

    The system software attempts to set this value automatically, but you can use the procedures in this topic to verify or reset the value. In an SGI ICE X cluster, the switch management network enables the Ethernet switch to control all VLANs and trunking.

    The following procedure explains how to enable the switch management network.

    Procedure 3-10. To enable the switch management network from the cluster configuration tool

    1. (Conditional) Log into the system admin controller (SAC) as the root user, and type the following command to start the cluster configuration tool:

      # /opt/sgi/sbin/configure-cluster

      Perform this step only if the cluster configuration tool is not running at this time.

    2. On the Main Menu screen, select S Configure Switch Management Network (optional), and select OK.

    3. On the pop-up window that appears, select Y yes , and select OK.

      Figure 3-8 shows the selection pop-up window:

      Figure 3-8. Configure Switch Management Network pop-up window


    Configuring an MCell Network

    Perform the procedure in this topic if your SGI ICE X system includes MCells.

    The MCell network is the internal network that powers the MCell cooling system. The following procedure explains how to enable the MCell network.

    Procedure 3-11. To enable MCells from the cluster configuration tool

    1. (Conditional) Log into the system admin controller (SAC) as the root user, and type the following command to start the cluster configuration tool:

      # /opt/sgi/sbin/configure-cluster

      Perform this step only if the cluster configuration tool is not running at this time.

    2. On the Main Menu screen, select N Configure MCell Network (optional), and select OK.

    3. On the pop-up window that appears, select Y yes , and select OK.

    Configuring MySQL Database Server Replication

    SGI recommends that you enable MySQL replication on very large systems to keep the internal cluster database synchronized. The master MySQL database server resides on the system admin controller (SAC). When you enable replication, data from the master MySQL database server is replicated to one or more MySQL database slaves on the rack leader controllers (RLCs) and service nodes. If your site has a large number of racks, using this feature can reduce the amount of contention for database resources on the SAC.

    If the database becomes corrupt, you can disable replication during the debugging session and reenable it later.

    The following procedure explains how to enable MySQL database replication.

    Procedure 3-12. To enable MySQL database replication from the cluster configuration tool

    1. (Conditional) Log into the SAC as the root user, and type the following command to start the cluster configuration tool:

      # /opt/sgi/sbin/configure-cluster

      Perform this step only if the cluster configuration tool is not running at this time.

    2. On the Main Menu screen, select Q Configure MySQL Replication (optional), and select OK.

    3. On the pop-up window that appears, select Y yes , and select OK.

    When enabling or disabling this feature, the configure-cluster command will back up the database, save some system attributes, and call /etc/opt/sgi/conf.d/80-update-mysql on the SAC, RLCs, and service nodes.

    When replication is OFF and the cattr command is run by a script on an RLC or service node, it uses the database on the SAC. You can verify this, as follows:

    r1lead:~ # chkconfig -l mysql
        mysql                     0:off  1:off  2:off   3:off   4:off  5:off   6:off
    r1lead:~ # grep -e hostname /etc/opt/sgi/cattr.conf
        hostname = admin
    r1lead:~ # cattr list | grep my_sql_replication
        my_sql_replication      : no
    

    When replication is ON and cattr is run by a script on an RLC or service node, it uses the replicated database on the node itself. You can verify this, as follows:

    r1lead:~ # chkconfig -l mysql
      mysql                     0:off  1:off  2:on   3:on   4:off  5:on   6:off
    r1lead:~ # grep -e hostname /etc/opt/sgi/cattr.conf
      hostname = localhost
    r1lead:~ # cattr list | grep my_sql_replication
      my_sql_replication      : yes
    

    To verify whether database replication is working on an RLC or service node, type the following command, which shows the current value:

    sys-admin:~ # cadmin --show-replication-status --node {node}

    See Chapter 15 of the MySQL 5.0 Reference Manual for detailed information regarding how replication is implemented and configured. This manual is available at http://dev.mysql.com/doc/refman/5.0/en/replication.html.

    Configuring the Default Maximum Rack Individual Rack Unit (IRU) Setting

    You can configure the maximum number of blade enclosures that an individual rack leader controller (RLC) can manage. When you set this to a value that is appropriate to your system size, it takes less time to distribute new software images to the blades in an enclosure. If you change this value, the system assigns the new value to any nodes that you discover.

    Procedure 3-13. To configure the default maximum IRU setting from the cluster configuration tool

    1. (Conditional) Log into the SAC as the root user, and type the following command to start the cluster configuration tool:

      # /opt/sgi/sbin/configure-cluster

      Perform this step only if the cluster configuration tool is not running at this time.

    2. On the Main Menu screen, select U Configure Default Max Rack IRU Setting (optional), and select OK.

    3. On the pop-up window that appears, type 8 or 4, and select OK.

      On SGI ICE X platforms, set this value to 8. On SGI ICE 8200 or SGI ICE 8400 platforms, set this value to 4.

    4. (Conditional) Use the cadmin command to change the maximum number of IRUs managed by existing, configured RLCs.

      Perform this step if your system includes configured RLCs that have a different number of IRUs configured as their maximum.

      Type the following command to retrieve the current setting:

      # cadmin --show-max-rack-irus

    Configuring the blademond Rescan Interval

    When enabled, the system checks every two minutes for changes to the number of blades in the system. If you remove or add a new blade, the system automatically detects this change, updates the system, and integrates the change on the rack. By default, the interval between checks is set to 120, which is two minutes.

    Procedure 3-14. To configure the blademond rescan interval from the cluster configuration tool

    1. (Conditional) Log into the SAC as the root user, and type the following command to start the cluster configuration tool:

      # /opt/sgi/sbin/configure-cluster

      Perform this step only if the cluster configuration tool is not running at this time.

    2. On the Main Menu screen, select C Configure blademond rescan interval (optional), and select OK.

    3. On the pop-up window that appears, accept the default of 120, which is two minutes, and select OK.

      Alternatively, type a different value and select OK.

    4. On the Main Menu screen, select U Configure Default Max Rack IRU Setting (optional), and select OK.

    5. Visually inspect the pop-up window that appears and verify that the maximum IRU setting is appropriate for your system.

      On SGI ICE X platforms, set this value to 8. On SGI ICE 8200 or SGI ICE 8400 platforms, set this value to 4.

      When the maximum IRU setting is configured correctly, the system manages the changes to your system more efficiently.

      For more information about this setting, see “Configuring the Default Maximum Rack Individual Rack Unit (IRU) Setting”.

    Installing the SGI Management Center License Key

    The SGI Management Center (SMC) software runs on the system admin controller (SAC). SMC provides a graphical user interface for system configuration, operation, and monitoring.

    For more information about using SMC, see SGI Management Center (SMC) System Administrator Guide.

    For more information about licensing, see the licensing FAQ on the following website:

    http://www.sgi.com/support/licensing/faq.html

    The following procedure explains how to obtain and install the license key for SMC.

    Procedure 3-15. To license the SMC software

    1. Use a text editor to open file /etc/lk/keys.dat.

    2. Copy and paste the license key string exactly as it was given to you.

    3. Save the file.

    4. Type the following command to restart the SMC daemon:

      # service mgr restart

    5. Proceed to the following:

      “Synchronizing the Software Repository, Installing Software Updates, and Cloning the Images”

    Synchronizing the Software Repository, Installing Software Updates, and Cloning the Images

    The following procedure explains how to update the software in the repositories that you created with the cluster configuration tool.

    Procedure 3-16. To update the software

    1. Log into the system admin controller (SAC), and type the following command to retrieve information about the network interface card (NIC) bonding method on the SAC:

      # cadmin --show-mgmt-bonding --node admin

      The command returns 802.3ad if bonding is set appropriately.

      If the command does not return 802.3ad, type the following commands to set the bonding appropriately, and reboot the system:

      # cadmin --set-mgmt-bonding --node admin 802.3ad
      # reboot

      Note the following related global settings:

      • discover_skip_switchconfig=YES. This setting is YES if manufacturing configured the switch. If you do not know the status, set it to NO.

      • mgmt_bonding=802.3ad. On the SGI ICE X SAC and service nodes, when this setting is active-backup instead, the system uses the first NIC unless the first NIC is unavailable, in which case it uses the second NIC until the first NIC can be used again.
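      To confirm the bonding mode that is currently in effect, you can inspect the kernel's bonding status file. This is a hedged check; it assumes the management bond interface is named bond0, which might differ on your system:

      # grep -i "bonding mode" /proc/net/bonding/bond0

      For 802.3ad, the output reports IEEE 802.3ad Dynamic link aggregation.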

    2. Type the following command to retrieve the new images:

      # sync-repo-updates

    3. Type the following command to install the images:

      # cinstallman --update-image

    4. Type the following command to create the new software images on the system:

      # create-default-sgi-images

    5. Proceed to the following:

      “Configuring the Switches”

    Configuring the Switches

    The discover command initializes and configures the system components for the SGI ICE system. You use the discover command to configure the SGI ICE system's management switches first, and then if you have MCells, you configure the MCell switches. After you configure the switches, you can configure the nodes.

    The following procedure is an overview of the switch configuration process.

    Procedure 3-17. To configure SGI ICE X switches

    1. Use one of the following procedures to configure the management switches:

      • “Configuring Management Switches With a MAC File”

      • “Configuring Management Switches Without a MAC File”

      The procedures differ depending on whether you have a media access control (MAC) file or not. The MAC file shows the MAC addresses of the components in your environment. Switch discovery and configuration can complete more quickly if you obtain this file. Without this file, you need to power cycle each switch manually.

      The following is an example MAC file:

      r1lead 00:30:48:9e:f2:59 00:30:48:F2:7E:A2
      r2lead 00:25:90:01:4e:3c 00:25:90:01:6c:cc
      service0 00:25:90:00:3b:8f 00:25:90:01:6e:9e
      mgmtsw0 00:26:f3:c3:7a:40 00:26:f3:c3:7a:40
      

      The content of a MAC file is as follows:

      • Column 1: The component's ID.

      • Column 2: For nodes, the MAC address of the baseboard management controller (BMC) on the component. For switches, the MAC address of the first network interface card (NIC), eth0. Switches do not have a BMC.

      • Column 3: The MAC address of the first network interface card (NIC), eth0. For switches, columns 2 and 3 are identical in the MAC file.

    2. (Conditional) Use the following procedure to configure MCell switches:

      “(Conditional) Configuring MCell Switches”

    Configuring Management Switches With a MAC File


    The following procedure explains how to configure your switches when you have each switch's MAC information in a MAC file.

    Procedure 3-18. To configure switches -- with a MAC file

    1. Gather information about the switches in your SGI ICE X system.

      Visually inspect your system. Note the types of switches you have and their identifiers. At a minimum, you have one management switch. You might also have InfiniBand switches and management switches attached to MCells. In this procedure, you configure only the management switches.

    2. Log in as root to the system admin controller (SAC), and write the MAC file to a location on your SAC.

      For example, write it to /var/tmp/mac_file.

    3. Power-on all the management switches.

    4. Type the following command:

      # discover --mgmtswitch num --macfile path

      For num, type the identifier for one of the switches. Visually inspect the outside of each switch to determine its identifier.

      For path, type the full path to the location of the MAC file.

      For example:

      # discover --mgmtswitch 0 --macfile /var/tmp/mac_file

    5. Repeat the preceding command for each switch attached to your SGI ICE system.

    6. Type the following command to retrieve information about the switches you discovered, and examine the output to confirm that all switches are included:

      # cnodes --mgmtswitch

    7. After all switches have been discovered, proceed to one of the following:

      • “(Conditional) Configuring MCell Switches”, if your system includes MCells

      • “discover Command”

    Configuring Management Switches Without a MAC File

    The following procedure explains how to configure your switches when you do not have the switch MAC information in a MAC file.

    Procedure 3-19. To configure switches -- without a MAC file

    1. Gather information about the switches in your SGI ICE X system.

      Visually inspect your system. Note the types of switches you have and their identifiers. At a minimum, you have one management switch. You might also have InfiniBand switches and MCell switches. In this procedure, you configure only the management switches.

    2. For each management switch stack, verify that only one cable goes from the first switch stack to the second switch stack.

      This cable should connect the master switch in the upper switch stack to the master switch in the switch stack immediately below. Each switch can have a cable plugged into its slave switch, but make sure the cables that connect the slave switches to each other are unplugged. This prevents looping.

    3. Log in as the root user to the system admin controller (SAC), and type the following command:

      # discover --mgmtswitch switch_ID

      For switch_ID, type the identifier for the management switch. To discover an external InfiniBand switch, use the --ibswitch option instead of --mgmtswitch.

      For example, the following command discovers management switch 0:

      # discover --mgmtswitch 0

    4. When prompted, connect the switch to a power source.

      The command discovers the MAC address of the switch after you connect the switch to a power source.


      Note: Do not power-on the switch at this time. Only connect the switch to a power source.


    5. Type the following command to save the MAC address to your MAC file:

      # discover --show-macfile > path

      For path, type the full path to the location of the MAC file. For example, /var/tmp/mac_file.

    6. Repeat the preceding steps for each switch that is attached to your SGI ICE system.

    7. Type the following command to retrieve information about the switches that you discovered:

      # cnodes --all
      

      If the output is very long, direct the output to a file that you can examine with a text editor. For example:

      # cnodes --all > switch_file

    8. After all switches have been discovered, proceed to one of the following:

      • “(Conditional) Configuring MCell Switches”, if your system includes MCells

      • “discover Command”

    Discovering Cascaded LG-E Switches

    This section describes how to discover LG-E (LG-Ericsson) switches when cascading them.

    Procedure 3-20. Discovering Cascaded LG-E Switches

    When you cascade LG-E switches, the system topology uses stacking, and one or more stack pairs are added to the configuration. To avoid looping, perform the following steps:

    1. Stack both switches but do not connect them to the stacked top level switches.

    2. Power on the switches and wait until one switch becomes the master and the other switch becomes a slave.

    3. Connect the master switch to a top level master switch, for example, from port 1/48 on the cascaded master switch to port 1/48 on the top level master switch.

    4. Use the discover command (see “discover Command”), to discover the switch pair. During the discover process, the switchConfig API is called; it will set Link Aggregation Control Protocol (LACP) for port 1/48 and 2/48 on the cascaded switches.

    5. Connect port 2/48 of the cascaded slave switch to port 2/48 on the top level slave switch.

    Since LACP is already configured, no loop is created.

      (Conditional) Configuring MCell Switches

      Perform the procedure in this topic if you have an SGI ICE X system that includes MCells.

      The following procedure explains how to configure the switches attached to the MCells.

      Procedure 3-21. To configure MCell switches

      1. Gather information about the MCell switches in your SGI ICE X system.

        Visually inspect your system. Note the switches identifiers, and note the port identifiers.

      2. Log in as the root user to the system admin controller (SAC), and type the following command:

        # switchconfig -s mgmtswnum -p port_num

        In mgmtswnum, replace num with the ID number of the management switch to which the cooling distribution unit (CDU) is attached.

        For port_num, type the port number.

      3. Repeat the preceding step for each cooling distribution unit (CDU) and each cooling rack controller (CRC) attached to your system.

      4. After all switches have been discovered, proceed to the following:

        “discover Command”

      discover Command

      The discover command is used to discover rack leader controllers (RLCs) and service nodes (and their associated BMC controllers) in an entire system or in a set of one or more racks that you select. Rack numbers generally start at one. Service nodes generally start at zero. The discover command is also used to discover external InfiniBand switches and system management switches.


      Caution: It is best to discover system management switches prior to any other component. That is because, as you discover node types, the tool automatically reconfigures the switch to operate properly as it proceeds.


      When you use the discover command to perform the discovery operation on your SGI ICE X system, you will be prompted with instructions on how to proceed (see “Installing Software on the Rack Leader Controllers (RLCs) and Service Nodes”).

      When using the --delrack and --delservice options, the node is not removed completely from the database; instead, it is marked with the administrative status NOT_EXIST. When you rediscover a node that previously existed, it receives the same IP allocations it had previously, and the node is then marked with the administrative status ONLINE.

      If a service node, for example, service0, has a custom host name of "myhost" and you later delete service0 with the discover --delservice command, the host name associated with it is still present. This can cause conflicts if you want to reuse the custom host name "myhost" on a node other than service0 in the future. You can use the cadmin --db-purge --node service0 command to remove the node entirely from the database (for more information, see “cadmin: SMC for SGI ICE X Administrative Interface” in Chapter 4). You can then reuse the "myhost" name.
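      For example, the following is a hedged sketch that marks service0 as deleted and then purges it from the database so that its custom host name can be reused later; both options appear elsewhere in this guide:

        admin:~ # discover --delservice 0
        admin:~ # cadmin --db-purge --node service0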

      There is a new hardware type named generic. This hardware type has its MAC address discovered, but it is intended for devices that have only a single MAC address and do not need to be managed by SMC for SGI ICE X software. The likely usage scenario is Ethernet switches that extend the management network, which are necessary in large SGI ICE X configurations.

      When the generic hardware type is used for external management switches on large SGI ICE X systems, the following guidelines should be followed:

      • The management switches should be the first hardware discovered in the system.

      • The management switches should both start with their power cords unplugged (analogous to how SMC for SGI ICE X discovers RLCs and service nodes).

      • The external switches can be given higher service node numbers if your site does not want them to take lower numbers.

      • You can also elect to give these switches an alternate host name using the cadmin command after discovery is complete (see the sketch after the examples below).

      • Examples of using the discover command with the generic hardware type are as follows:

        admin:~ # discover --service 98,generic
        admin:~ # discover --service 99,generic
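
      As noted above, you can assign a more descriptive host name to such a switch after discovery. A minimal sketch, assuming the cadmin --set-hostname and --node options and an illustrative name:

        admin:~ # cadmin --set-hostname --node service98 mgmtsw-ext1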


      Note: When you use the discover command to discover an SGI XE500 service node, you must specify the hardware type. Otherwise, the serial console will not be set up properly. Use a command similar to the following:
      admin:~ # discover --service 1,xe500



      For a discover command usage statement, perform the following:

      [sys-admin ~]# discover --h
      Usage: discover [OPTION]...
      Discover lead nodes, service nodes, and external switches.
      
      Options:
        --delrack NUM[,FLAG]...        mark rack leaders as deleted
        --delservice NUM               mark a service node as deleted
        --delibswitch NUM              mark an external ib switch as deleted
        --delmgmtswitch NUM            mark a mgmt network switch as deleted
        --force                        avoid sanity checks that require input
        --ignoremac MAC                ignore the specified MAC address
        --macfile FILE                 read mac addresses from FILE
        --rack NUM[,FLAG]...           discover a specific rack or set of racks
        --rackset NUM,COUNT[,FLAG]...  discover count racks starting at #
        --service NUM[,FLAG]...        discover the specified service node
        --ibswitch NUM[,FLAG]...       discover the specified external ib switch
        --mgmtswitch NUM[,FLAG]...     discover the specified mgmt switch
        --show-macfile                 print output usable for --macfile to stdout
      
      Details:
        Any number of management switches, racks, service nodes, or external
        switches can be discovered in one command line.  Rack numbers generally
        start at 1, service nodes, management switches, and infiniband switches
        generally start at 0.  An existing node can be re-discovered by re-running
        the discover command. An easier way to simply re-image a node is by
        using the cinstallman command, see the --next-boot and --assign-image
        options.
      
        A comma searated set of optional FLAGs modify how discover proceeds for the
        associated node and sets it up for installation.  FLAGs can be used to
        specify hardware type, image, console device, etc.
      
        The 'generic' hardware type is for hardware that should be discovered but
        that only has one IP address associated with it.  Tempo will treat this
        hardware as an unmanaged service node.  An example use would be for the
        administrative interface of an ethernet switch being used for the Tempo
        management network. When this type is used, the generic hardware being
        discovered should be doing a DHCP request.
      
        The 'other' hardware type should be used for a service node which is not
        managed by Tempo.  This mode will allocate IPs for you and print them to
        the screen.  Since Tempo only prints IP addresses to the screen in this
        mode, the device being discovered does not even need to exist at the
        moment the operation is performed.
      
        The --macfile option can be used instead of discovering MACs by power cycling.
        All MACs to be discovered must be in the file.  External switches should
        simply repeat the same MAC twice in this file.  File format:
              Example file contents:
          r1lead 00:11:22:33:44:55 66:77:88:99:EE:FF
          service0 00:00:00:00:00:0A 00:00:00:00:00:0B
          extsw1 00:00:00:00:00:11 00:00:00:00:00:11
      
      Hardware Type Flags:
        altix4000 altix450 altix4700 default generic h2106-g7 ice-csn iss3500-intel
        kvm other uv10 xe210 xe240 xe250 xe270 xe310 xe320 xe340 xe500
      
      Switch Type Flags:
        voltaire-isr-9288 voltaire-isr-9096 voltaire-isr-9024 voltaire-isr-2004
        voltaire-isr-2012 voltaire4036 mellanox5030 mellanox5600
      
      
      Other Flags:
        image=IMAGE                   specify an alternate image to install
        console_device=DEVICE         use DEVICE for console
        net=NET                       ib0 or ib1, for external IB switches only
        type=TYPE                     leaf or spine, for external IB switches only
        redundant_mgmt_network=YESNO  yes or no, determines how network is configured
        switch_mgmt_network=YESNO     no if node is in an ICE8200/ICE8400 system
        mgmt_bonding=TYPE             type of bonding to use: active-backup or 802.3ad
        ha=all                        High Availabity solution for the rack (HA-RLC)
        ha=1                          the command applies for the HA-RLC #1
        ha=2                          the command applies for the HA-RLC #2
        only_bmc=YESNO                yes: only BMC discovered (but all IPs allocated)
        bt=YESNO                      yes: use bittorrent while imaging, default no
      
      Examples:
        Discover a top level management switch
          # discover --mgmtswitch 0
        You can later use the 'cadmin' command to give it a custom hostname if you
        so choose.
      
        Discover rack 1 and service node 0:
          # discover --rack 1 --service 0
      
        Discover service 0, using myimage and disabling redundnat_mgmt_network.
          # discover --service 0,image=myimage,redundant_mgmt_network=no
      
        Discover racks 1 and 4, service node 1, ignores MAC address 00:04:23:d6:03:1c:
          # discover --ignoremac 00:04:23:d6:03:1c --rack 1 --rack 4 --service 1
      
        Discover racks 1-5, service node 0-2, where service node 1 is Altix 450
        hardware and service node 2 is "other":
          # discover --rackset 1,5 --service 0,xe240 --service 1,altix450 --service 2,other
      
        Discover an external ib switch, corresponding to the voltaire-isr-9024
        hardware and IB0 fabric.
          # discover --ibswitch 0,voltaire-isr-9024,net=ib0,type=spine
        You can later use the 'cadmin' command to give it a custom hostname if you
        so choose.
      
        Discover a switch used to extend the Tempo management network - a generic
        device.
         # discover --service 99,generic
      
        Discover two leaders for rack 1 (High Availability):
          # discover --rack 1,ha=all
      
        Discover r1lead1 (High Availability):
          # discover --rack 1,ha=1
      
        Discover r1lead2 (High Availability):
          # discover --rack 1,ha=2
      
        Discover two leaders per rack for racks 1, 2, and 3 (High Availability):
          # discover --rackset 1,3,ha=all
      
        Delete r1lead1 (High Availability):
          # discover --delrack 1,ha=1
      
        Delete r1lead2 (High Availability):
          # discover --delrack 1,ha=2
      

      EXAMPLES

      Example 3-1. discover Command Examples

      The following examples walk you through some typical discover command operations.

      To discover a top level management switch, perform the following:

      admin:~ # /opt/sgi/sbin/discover --mgmtswitch 0 

      You can later use the cadmin command to give it a custom hostname if you so choose.

      To discover rack 1 and service node 0, perform the following:

      admin:~ # /opt/sgi/sbin/discover --rack 1 --service 0,xe210

      In this example, service node 0 is an SGI Rackable C2108-TY10 system.

      To discover racks 1-5, and service node 0-2, perform the following:

      admin:~ # /opt/sgi/sbin/discover --rackset 1,5 --service 0,c2108 --service 1,altix450 --service 2,other

      In this example, service node 1 is an Altix 450 system, and service node 2 is the other hardware type.


      To discover service 0, but use service-myimage instead of service-sles11 (default), perform the following:

      admin:~ # /opt/sgi/sbin/discover --service 0,image=service-myimage


      Note: You may direct a service node to image itself with a custom image later, without re-discovering it. See “cinstallman Command” in Chapter 4.


      To discover racks 1 and 4, service node 1, and ignore MAC address 00:04:23:d6:03:1c , perform the following:

      admin:~ # /opt/sgi/sbin/discover --ignoremac 00:04:23:d6:03:1c --rack 1 --rack 4 --service 1

      The discover command supports external switches in a manner similar to racks and service nodes, except that switches do not have BMCs and there is no software to install. The syntax to add a switch is, as follows:

      admin:~ # discover --ibswitch name,hardware,net=fabric,type=spine

      where name can be any alphanumeric string, hardware is any one of the supported switch types (run discover --help to get a list), net=fabric is either ib0 or ib1, and type is either leaf or spine (for external IB switches only).

      An example command is, as follows:

      # discover --ibswitch extsw,voltaire-isr-9024,net=ib0,type=spine

      Once discover has assigned an IP address to the switch, it will call the fabric management sgifmcli command to initialize it with the information provided. The /etc/hosts and /etc/dhcpd.conf files should also have entries for the switch as named, above. You can use the cnodes --ibswitch command to list all such nodes in the cluster.
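
      For example, after discovering a switch named extsw, you could verify that it is known to the cluster as follows (output omitted):

      admin:~ # cnodes --ibswitch
      admin:~ # grep extsw /etc/hosts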

      To remove a switch, perform the following:

      admin:~ # discover --delibswitch name

      where name is that of a previously discovered switch.

      An example command is, as follows:

      admin:~ # discover --delibswitch extsw

      When you are discovering a node, you can use an additional option to turn the redundant management network on or off for that node. For example:

      admin:~ # discover --service 0,xe500,redundant_mgmt_network=no

      To discover a switch used to extend the SMC for SGI ICE X management network (a generic device), perform the following:

      admin:~ # discover --service 99,generic

      Configuring the Rack Leader Controllers (RLCs) and Service Nodes

      The discover command identifies and configures the RLCs and service nodes on the SGI ICE X system.

      Installing SMC for ICE X System Admin Controller (SAC) Software

      .

      cmcdetectd Daemon

      The cmcdetectd daemon runs on the system admin controller (SAC). When it sees a chassis management controller (CMC) asking for an IP address, it looks at the client ID of the request. That client ID contains the rack number and slot number. The cmcdetectd daemon provides this information to the switchConfig application programming interface (API) and configures the top level switch fabric.

      The cmcdetectd daemon performs the following:

      • The cmcdetectd daemon only starts working after at least one management switch has been discovered.

      • It configures the switch and "moves" the CMCs to the appropriate VLAN.

      • If there are two switches in the switch stack, cmcdetectd through the switchConfig API configures the CMC ports for manual trunking (the CMC-0 and CMC-1 ports on the physical CMC).

      • Once moved, the dynamic host configuration protocol (DHCP) requests are directed to the rack VLAN for a given rack and detected by the rack leader controller (RLC) when it is discovered.

      • If the cmcdetectd daemon detects any SGI ICE X CMCs, it automatically sets the switch management network variable to true (see “Configuring a Switch Management Network”).

      • If you install a second slot or re-install a system, the switch is already configured and the cmcdetectd daemon does not see the requests any more. It is a good practice to manually configure the switch management network setting using the configure-cluster option.

      Installing Software on the Rack Leader Controllers (RLCs) and Service Nodes

      The discover command, described in “discover Command”, sets up the RLC and managed service nodes for installation and discovery. This section describes the discovery process you use to determine the Media Access Control (MAC) address, that is, the unique hardware address, of each RLC and then how to install software on the RLCs.


      Note: When RLCs and service nodes come up and are configured to install themselves, they determine which Ethernet devices are the integrated ones by accepting DHCP leases only from SMC for SGI ICE X. They then know that the interface they got a lease from must be an integrated Ethernet device. This is facilitated by using a DHCP option code. SMC for SGI ICE X uses option code 149 by default. In rare situations, a house network DHCP server could be configured to use this option code. In that case, nodes that are connected to the house network could misinterpret a house DHCP server as being an SMC for SGI ICE X one and auto-detect the interface incorrectly. This would lead to an installation failure.

      To change the DHCP option code number used for this operation, see the cadmin --set-dhcp-option option. The --show-dhcp-option option shows the current value. For more information on using the cadmin command, see “cadmin: SMC for SGI ICE X Administrative Interface” in Chapter 4.


      Procedure 3-22. Installing Software on the RLCs and Service Nodes

        To install software on the RLCs, perform the following steps:

        1. Use the discover command from the command line, as follows:

          # /opt/sgi/sbin/discover --rack 1
          


          Note: You can discover multiple racks at a time using the --rackset option. Service nodes can be discovered with the --service option.


          The discover script executes. When prompted, turn the power on to the node being discovered and only that node.


          Note: Make sure you power on only the node being discovered and nothing else in the system. Do not power up the system itself.


          When the node has electrical power, the BMC starts up even though the system is not powered on. The BMC does a network DHCP request that the discover script intercepts and then configures the cluster database and DHCP with the MAC address for the BMC. The BMC then retrieves its IP address. Next, this script instructs the BMC to power up the node. The node performs a DHCP request that the script intercepts and then configures the cluster database and DHCP with the MAC address for the node. The RLC installs itself using the systemimager software and then boots itself.

          The discover script will turn on the chassis identify light for 2 minutes. Output similar to the following appears on the console:

          Discover of rack1 / leader node r1lead complete
          r1lead has been set up to install itself using systemimager
          The chassis identify light has been turned on for 2 minutes

        2. The blue chassis identify light is your cue to power on the next RLC and start the process all over.

          You may watch install progress by using the console command. For example, console r1lead connects you to the console of the r1lead so that you can watch installation progress. The sessions are also logged. For more information on the console command, see “Console Management” in Chapter 4.

        3. Using the identify light, you can configure all the RLCs and service nodes in the cluster without having to go back and forth to your workstation between each discovery operation. Just use the identify light on the node that was just discovered as your cue to plug in the next node.

        4. Shortly after the discover command reports that discovery is complete for a given node, that node installs itself. If you supplied multiple nodes on the discover command line, it is possible multiple nodes could be in different stages of the imaging/installation process at the same time. When the RLC boots up for the first time, one process it starts is the blademond process. This process discovers the IRUs and attached blades and sets them up for use. The blademond process is described in “Configuring the blademond Rescan Interval”, which notes the files to watch for progress and includes a blademond --help statement.

          Some sites choose to turn the blademond daemon off and only enable it periodically. The blademond --scan-once option allows you to easily run blademond once from the command line and watch the output.


          Note: Never run the blademond --scan-once command if blademond is already running as a daemon on your system.
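
          For example, with the daemon stopped, a one-time scan run from the rack leader controller might look like the following (the RLC prompt shown is illustrative):

          r1lead:~ # blademond --scan-once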


          If your discover process does not find the appropriate BMC after a few minutes, the following message appears:

          ==============================================================================
          Warning: Trouble discovering the BMC!
          ==============================================================================
          3 minutes have passed and we still can't find the BMC we're looking for.
          We're going to keep looking until/if you hit ctrl-c.
          
          Here are some ideas for what might cause this:
          
            - Ensure the system is really plugged in and is connected to the network.
            - This can happen if you start discover AFTER plugging in the system.
              Discover works by watching for the DHCP request that the BMC on the system
              makes when power is applied.  Only nodes that have already been discovered
              should be plugged in.  You should only plug in service and leader nodes
              when instructed.
            - Ensure the CMC is operational and passing network traffic.
            - Ensure the CMC firwmare up to date and that it's configured to do VLANs.
            - Ensure the BMC is properly configured to use dhcp when plugged in to power.
            - Ensure the BMC, frusdr, and bios firmware up to date on the node.
            - Ensure the node is connected to the correct CMC port.
          
          Still Waiting.   Hit ctrl-c to abort this process.  That will abort discovery
          at this problem point -- previously discovered components will not be affected.
          ==============================================================================

          If your discover process finds the appropriate BMC, but cannot find the RLC or service node that is powered up after a few minutes, the following message appears:

          ==============================================================================
          Warning: Trouble discovering the NODE!
          ==============================================================================
          4 minutes have passed and we still can't find the node.
          We're going to keep looking until/if you hit ctrl-c.
          
          If you got this far, it means we did detect the BMC earlier,
          but we never saw the node itself perform a DHCP request.
          
          Here are some ideas for what might cause this:
          
           - Ensure the BIOS boot order is configured to boot from the network first
           - Ensure the BIOS / frusdr / bmc firmware are up to date.
           - Is the node failing to power up properly? (possible hardware problem?)
             Consider manually pressing the front-panel power button on this node just
             in case the ipmitool command this script issued failed.
           - Try connecting a vga screen/keyboard to the node to see where it's at.
           - Is there a fault on the node?  Record the error state of the 4 LEDs on the
             back and contact SGI support.  Consider moving to the next rack in the mean
             time, skippnig this rack (hit ctrl-c and re-run discover for the other
             racks and service nodes).
           
          Still Waiting.   Hit ctrl-c to abort this process.  That will abort discovery
          at this problem point -- previously discovered components will not be affected.
          ==============================================================================

        5. You are now ready to discover and install software on the compute blades in the rack. For instructions, see “Discovering Compute Nodes”.

        Discovering Compute Nodes

        This section describes how to discover compute nodes in your SGI ICE X system. The blademond daemon that runs on the rack leader controllers (RLCs) calls the discover-rack command to discover a rack and integrate new compute nodes (blades). For more information, see “Configuring the blademond Rescan Interval”.

        Procedure 3-23. Discovering Compute Nodes

          To discover compute nodes (blades) in your SGI ICE X system, perform the following:

          1. Complete the steps in “Installing Software on the Rack Leader Controllers (RLCs) and Service Nodes”.

          2. For instructions on how to configure, start, verify, or stop the InfiniBand Fabric management software on your SGI ICE X system, see Chapter 5, “System Fabric Management”.


          Note: The InfiniBand fabric does not automatically configure itself. For information on how to configure and start up the InfiniBand fabric, see Chapter 5, “System Fabric Management”.


          Service Node Discovery, Installation, and Configuration

          Service nodes are discovered and deployed similarly to rack leader controllers (RLCs). The discover command, with the --service related options, allows you to discover service nodes in the same discover operation that discovers the RLCs.

          Like RLCs, the service node is automatically installed. The service node image associated with the given service node is used for installation.

          Service nodes have one, or possibly two, Ethernet connection(s) to the SGI ICE network. Service nodes may also be connected to your house network. Typically, interfaces with lower numbers are connected to the SGI ICE X network (for example, eth0, or eth0 and eth1), and any remaining network interfaces are used to connect to the house network.

          The firstboot system setup script does not start automatically on the system console after the first boot after installation (unlike the system admin controller (SAC)).

          Use YAST to set up the public/house network connection on the service node, as follows:

          • Select the interface which is connected to your house network to configure in firstboot (for example, eth1 or eth2).

          • If you change the default host name, you need to make sure that the cluster service name is still resolvable as tools depend on that.

          • Name service configuration is handled by the system admin controller (SAC) and RLCs. Therefore, service node resolv.conf files must always point to the SAC and RLCs in order to resolve cluster names. If you wish to resolve host names on your house network, use the configure-cluster command to configure the house name servers. The SAC and RLCs will then be able to resolve your house network addresses in addition to the internal cluster hostnames. In addition, the cluster configuration update framework may replace your resolv.conf file when cluster configuration adjustments are made.

            Do not change resolv.conf, and do not configure different name servers in YaST.

          In some rare cases, your house network may use the same DHCP option identifier as the SMC for SGI ICE X system software. In this case, two events could happen:

          • The imaging client could get a DHCP lease from your house network DHCP server.

          • Imaging could fail because it cannot reach the SAC.

          The SMC for SGI ICE X DHCP option identifier is 149, as shown by the cadmin command:

          admin:~ # cadmin  --show-dhcp-option
          149 

           You can use the cadmin --set-dhcp-option {value} option to change the SMC for SGI ICE X DHCP option identifier so that it is different from the one used on your house network. For more information on the cadmin command, see “cadmin: SMC for SGI ICE X Administrative Interface” in Chapter 4.
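
           For example, to change the identifier to an unused value such as 150 (an illustrative value; choose one that your house network does not use):

           admin:~ # cadmin --set-dhcp-option 150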

          InfiniBand Configuration

           Before you start configuring the InfiniBand network, you need to ensure that all hardware components of the cluster have been discovered successfully, that is, the system admin controller (SAC), the rack leader controllers (RLCs), and the service and compute nodes. You also need to have finished the cluster configuration steps in “Running the Cluster Configuration Tool”.

           Sometimes, InfiniBand switch monitoring errors can appear before the InfiniBand network has been fully configured. To disable InfiniBand switch monitoring, enter the following command:

          % cattr set disableIbSwitchMonitoring true     
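
           After the InfiniBand network has been configured and is operating normally, monitoring can presumably be re-enabled by setting the same attribute back to false (an assumption based on the command above; verify the attribute value at your site):

           % cattr set disableIbSwitchMonitoring false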

          To configure the InfiniBand network, start the configure-cluster command again on the system admin controller (SAC). Since the Initial Setup has been done already, you can now use the Configure InfiniBand Fabric option to configure the InfiniBand fabric as shown in Figure 3-9.

          Figure 3-9. Configure InfiniBand Fabric from Cluster Configuration Tool


           Select the Configure InfiniBand Fabric option; the InfiniBand Fabric Management tool appears, as shown in Figure 3-10.

          Figure 3-10. InfiniBand Management Tool Screen


           Use the online help available with this tool to guide you through the InfiniBand configuration. After configuring and bringing up the InfiniBand network, select the Administer InfiniBand ib0 option or the Administer InfiniBand ib1 option; the Administer InfiniBand screen appears, as shown in Figure 3-11. Verify the status using the Status option.

          Figure 3-11. Administer InfiniBand GUI


          The Status option returns information similar to the following:
          Master SM
          Host = r1lead
          Guid = 0x0002c9030006938b
          Fabric = ib0
          Topology = hypercube
          Routing Engine = dor
          OpenSM = running

          Press the Enter key to return to the configure-cluster GUI.

          Configuring the Service Node

          This section describes how to configure a service node and covers the following topics:

          Link Aggregation, Rack Leader Controllers (RLCs), and Service Nodes

          Link aggregation is initiated for an SGI ICE X cluster when one of the following events occurs:

          • The Configure Switch Management Network option on the Cluster Configuration Tool: Main Menu is used to configure the switch management network (see “Configuring a Switch Management Network”).

          • The cmcdetectd daemon finds SGI ICE X CMCs and turns link aggregation on automatically.

           There are two potential types of interface bonding in play on an SGI ICE X system, and the default is link aggregation 802.3ad. The two bonding modes are as follows:

           • Active-backup, which is what was used on SGI ICE 8400 systems in redundant management network setups

          • Link aggregation 802.3ad (The SGI ICE X default)

           In order for link aggregation 802.3ad to work properly, the target service node must have a BMC with a dedicated Ethernet connection. If the BMC is in-band, sharing the physical wire with eth0, the BMC will disappear when link aggregation is negotiated.

          In order to force active-backup bonding and avoid this problem, you can perform one of the following:

          • Use the discover command with the mgmt_bonding=active-backup flag

          • In the case where a node is already installed, use the cadmin command with the --set-mgmt-bonding option.

           You can also use the cadmin command to set link aggregation 802.3ad on the system admin controller (SAC) to increase bandwidth capacity between the SAC, the RLCs, and the service nodes.
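
           For example, to force active-backup bonding at discovery time for a service node whose BMC shares the physical wire with eth0 (the node number and hardware type shown are illustrative):

           admin:~ # discover --service 2,xe500,mgmt_bonding=active-backup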

          Service Node Configuration for NAT

          You may want to reach network services outside of your SGI ICE X system. For this type of access, SGI recommends using Network Address Translation (NAT), also known as IP Masquerading or Network Masquerading. Depending on the amount of network traffic and your site needs, you may want to have multiple service nodes providing NAT services.

          Procedure 3-24. Service Node Configuration for NAT

            To enable NAT on your service node, perform the following steps:

            1. Use the configuration tools provided on your service node to turn on IP forwarding and enable NAT/IP MASQUERADE.

              Specific instructions should be available in the third-party documentation provided for your storage node system. Additional documentation is available at /opt/sgi/docs/setting-up-NAT/README. This document describes how to get NAT working for both IB interfaces.


              Note: This file resides only on the service node. You need to log in to the service node (for example, # ssh service0) and then change to the directory (# cd /opt/sgi/docs/setting-up-NAT).


            2. Update all of the compute node images with default route configured for NAT.

              SGI recommends using the script /opt/sgi/share/per_host_customization/global/sgi-static-routes on the system admin controller (SAC), which can customize the routes based on the rack, IRU, and slot of the compute blade. Some examples are available in that script.

            3. Use the cimage --push-rack command to propagate the changes to the proper location for compute nodes to boot. For more information on using the cimage command, see “cimage Command” in Chapter 4 and “Customizing Software On Your SGI ICE X System” in Chapter 4.

            4. Use the cimage --set command to select the image.

            5. Reboot/reset the compute nodes using that desired image.

            6. Once the service node(s) have NAT enabled and are attached to an operational house network, and the compute nodes are booted from an image that sets their routing to point at the service node, test the NAT operation by using the ping(8) command to ping known IP addresses on the house network from an interactive session on the compute blade (see the example after this procedure).

            7. If the ping test fails, see “Troubleshooting Service Node Configuration for NAT”, which follows.
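
            For example, from an interactive session on a compute blade (the compute node prompt and the address shown are illustrative; substitute a known host on your house network):

            r1i0n0:~ # ping -c 3 192.0.2.10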

            Troubleshooting Service Node Configuration for NAT

            Troubleshooting can become very complex. The first steps are to determine that the service node(s) are correctly configured for the house network and can ping the house IP addresses. Good choices are house name servers possibly found in the /etc/resolv.conf or /etc/name.d.conf files on the system admin controller (SAC). Additionally, the default gateway addresses for the service node may be a good choice. You can use the netstat -rn command for this information, as follows:

            system-1:/ # netstat -rn
            Kernel IP routing table
            Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
            128.162.244.0   0.0.0.0         255.255.255.0   U         0 0          0 eth0
            172.16.0.0      0.0.0.0         255.255.0.0     U         0 0          0 eth1
            169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
            172.17.0.0      0.0.0.0         255.255.0.0     U         0 0          0 eth1
            127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
            0.0.0.0         128.162.244.1   0.0.0.0         UG        0 0          0 eth0

            If the ping command executed from the service node to the selected IP address gets responses, use network monitoring tools such as tcpdump(1). On the service node, monitor the eth1 interface and, simultaneously in a separate session, monitor the ib[01] interface. Specify a filter specific enough to avoid extraneous noise, and then attempt to execute a ping command from the compute node.

            Example 3-2. tcpdump Command Examples

            tcpdump -i eth1 ip proto ICMP # Dump ping packets on the public side of service node.
            tcpdump -i ib1 ip proto ICMP # Dump ping packets on the IB fabric side of service node.
            tcpdump -i eth1 port nfs # Dump NFS traffic on the eth1 side of service node.
            tcpdump -i ib1 port nfs # Dump NFS traffic on the IB fabric side of service node.

            If packets do not reach the service node's respective IB interface, perform the following:

            • Check the SAC's compute image configuration of the default route.

            • Verify that this image has been pushed to the compute nodes.

            • Verify that the compute nodes have booted with this image.

            If the packets reach the service node's IB interface but do not exit the eth1 interface, verify the NAT configuration on the service node.

            If the packets exit the eth1 interface, but replies do not return, verify the house network configuration and that IP masquerading is properly configured so that the packets exiting the interface appear to be originating from the service node and not the compute node.

            Using External DNS for Compute Node Name Resolution

            You may want to configure service node(s) to act as NAT gateways for your cluster (see “Service Node Configuration for NAT”) and to have the host names for the compute nodes in the cluster resolve through external DNS servers.

            You need to reserve a large block of IP addresses on your house network. If you configure to resolve via external DNS, you need to do it for both the ib0 and ib1 networks, for all node types. In other words, ALL -ib* addresses need to be provided by external DNS. This includes compute nodes, rack leader controllers (RLCs), and service nodes. Careful planning is required to use this feature. Allocation of IP addresses will often require assistance from a network administrator of your site.

            Once the IP addresses have been allocated on the house network, you need to tell the SMC for SGI ICE X software the IP addresses of the DNS servers on the house network that the SMC for SGI ICE X software can query for hostname resolution.

            To do this, use the configure-cluster tool (see “Running the Cluster Configuration Tool”). The menu item that handles this operation is Configure External DNS Masters (optional), as shown in Figure 3-12.

            Figure 3-12. Configure External DNS Masters Option Screen


            From the Configure External DNS Master(s) screen, click the Yes button, as shown in Figure 3-13.

            Figure 3-13. Configure External DNS Master(s) Screen


            Some important considerations are, as follows:

             • It is important to note that if you choose to use external DNS, you need to make this change before discovering anything. The change is not retroactive. If you have already discovered some nodes and then turn on external DNS support, the IP addresses that SMC for SGI ICE X already assigned to those nodes remain in effect.

            • This is an optional feature that only a small set of customers will need to use. It should not be used by default.

            • This feature only makes sense if the compute nodes can reach the house network. This is not the default case for SGI ICE X systems.

            • It is assumed that you have already configured a service node to act as a NAT gateway to your house network (see “Service Node Configuration for NAT”) and that the compute nodes have been configured to use that service node as their gateway.

            Service Node Configuration for DNS

            For information on setting up DNS, see Figure 3-6.

            Service Node Configuration for NFS

            Assuming the installation has either NAT or Gateway operations configured on one or more service nodes, the compute nodes can directly mount the house NFS server's exports (see the exports(5) man page).

            Procedure 3-25. Service Node Configuration for NFS

              To allow the compute nodes to directly mount the house NFS server's exports, perform the following steps:

              1. Edit file /opt/sgi/share/per_host_customization/global/sgi-fstab on the system admin controller (SAC), or edit an image-specific script.

               2. Add the mount point to the script, push the image, and reset the node. A sketch of such an fstab entry appears after this procedure.

               3. The server's export should get mounted. If it is not mounted, use the troubleshooting technique outlined in “Troubleshooting Service Node Configuration for NAT”.
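
               A minimal sketch of such an fstab entry as it might appear in the sgi-fstab script, assuming a house NFS server named housenfs that exports /export/data (both names are illustrative):

               housenfs:/export/data  /data           nfs     hard            0       0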

              Service Node Configuration for NIS for the House Network

              This section describes two different ways to configure NIS for service nodes and compute blades when you want to use the house network NIS server, as follows:

              • NIS with the compute nodes directly accessing the house NIS infrastructure

              • NIS with a service node as a NIS slave server to the house NIS master

              The first approach would be used in the case where a service node is configured with network address translation (NAT) or gateway operations so that the compute nodes can access the house network directly.

              The second approach may be used if the compute nodes do not have direct access to the house network.

              Procedure 3-26. NIS with Compute Nodes Directly Accessing the House NIS Infrastructure

                To setup NIS with the compute nodes directly accessing the house NIS infrastructure, perform the following steps:

                1. In this case, you do not have to set up any additional NIS servers. Instead, each service node and compute node should be configured to bind to the existing house network servers. The nodes should already have the ypbind package installed. The following steps should work with most Linux distributions. You may need to vary them slightly to meet your specific needs.

                2. For service nodes, the instructions are very similar to those found in “Setting Up a SLES Service Node as a NIS Client”.

                  The only difference is that you should configure yp.conf to point at the IP address of your house network NIS server rather than the rack leader controller (RLC), as described in the sections listed above.

                Procedure 3-27. NIS with a Service Node as a NIS Slave Server to the House NIS Master

                  To setup NIS with a service node as a NIS slave server to the house NIS master, perform the following:

                  1. Any service nodes that are NOT acting as an NIS slave server can be pointed at the existing house network NIS servers as described in Procedure 3-26. This is because they have house interfaces.

                  2. One (or more) service node(s) should then be configured as NIS slave server(s) to the existing house network NIS master server.

                    Because SGI cannot anticipate which operating system or release the house network NIS master server runs, no specific suggestions can be offered for configuring it to accept the new NIS slave servers.

                  Setting Up an NFS Home Server on a Service Node for Your SGI ICE X System

                  This section describes how to make a service node an NFS home directory server for the compute nodes.


                  Note: Having a single, small server provide filesystems to the whole SGI ICE X system could create network bottlenecks that the hierarchical design of SGI ICE X is meant to avoid, especially if large files are stored there. Consider putting your home filesystems on a NAS file server. For instructions on how to do this, see “Service Node Configuration for NFS”.


                  The instructions in this section assume you are using the service node image provided with the SMC for SGI ICE X software. If you are using your own installation procedures or a different operating system, the instructions will not be exact but the approach is still appropriate.


                  Note: The example below specifically avoids using /dev/sdX style device names. This is because /dev/sdX device names are not persistent and may change as you adjust disks and RAID volumes in your system. In some situations, you may assume /dev/sda is the system disk and that /dev/sdb is a data disk; this is not always the case. To avoid accidental destruction of your root disk, follow the instructions given below.


                  When you are choosing a disk, please consider the following:

                  To pick a disk device, first find the device that is currently being used as root so that you do not re-partition the installation disk by accident. To find which device is being used for root, use this command:

                  # ls -l /dev/disk/by-label/sgiroot
                  lrwxrwxrwx 1 root root 10 2008-03-18 04:27 /dev/disk/by-label/sgiroot ->
                  ../../sda2

                  At this point, you know the sd name for your root device is sda.

                  SGI suggests you use by-id device names for your data disk. Therefore, you need to find the by-id name that is NOT your root disk. To do that, use the ls command to list the contents of /dev/disk/by-id, as follows:

                  # ls -l /dev/disk/by-id
                  total 0
                  lrwxrwxrwx 1 root root  9 2008-03-20 04:57 ata-MATSHITADVD-RAM_UJ-850S_HB08_020520 -> ../../hdb
                  lrwxrwxrwx 1 root root  9 2008-03-20 04:57 scsi-3600508e000000000307921086e156100 -> ../../sda
                  lrwxrwxrwx 1 root root 10 2008-03-20 04:57 scsi-3600508e000000000307921086e156100-part1 -> ../../sda1
                  lrwxrwxrwx 1 root root 10 2008-03-20 04:57 scsi-3600508e000000000307921086e156100-part2 -> ../../sda2
                  lrwxrwxrwx 1 root root 10 2008-03-20 04:57 scsi-3600508e000000000307921086e156100-part5 -> ../../sda5
                  lrwxrwxrwx 1 root root 10 2008-03-20 04:57 scsi-3600508e000000000307921086e156100-part6 -> ../../sda6
                  lrwxrwxrwx 1 root root  9 2008-03-20 04:57 scsi-3600508e0000000008dced2cfc3c1930a -> ../../sdb
                  lrwxrwxrwx 1 root root 10 2008-03-20 04:57 scsi-3600508e0000000008dced2cfc3c1930a-part1 -> ../../sdb1
                  lrwxrwxrwx 1 root root  9 2008-03-20 09:57 usb-PepperC_Virtual_Disc_1_0e159d01a04567ab14E72156DB3AC4FA -> ../../sr0

                  In the output above, you can see that ID scsi-3600508e000000000307921086e156100 is in use by your system disk because it has a symbolic link pointing back to ../../sda, so do not consider that device. The other disk in the listing has ID scsi-3600508e0000000008dced2cfc3c1930a and happens to be linked to /dev/sdb.

                  Therefore, you know the by-id name you should use for your data is /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a because it is not connected with sda, which we found in the first ls example happened to be the root disk.

                  Partitioning, Creating, and Mounting Filesystems

                  Procedure 3-28. Partitioning and Creating Filesystems for an NFS Home Server on a Service Node

                    The following example uses the ID /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a as the empty disk on which you will put your data. It is very important that you know this for sure. In “Setting Up an NFS Home Server on a Service Node for Your SGI ICE X System”, an example is provided that allows you to determine where your root disk is located so you can avoid accidentally destroying it. Remember, in some cases, /dev/sdb will be the root drive and /dev/sda or /dev/sdc may be the data drive. Confirm that you have selected the right device, and use the persistent device name to help prevent accidental overwriting of the root disk.


                    Note: Steps 1 through 7 of this procedure are performed on the service node. Steps 8 and 9 are performed from the system admin controller (SAC).


                    To partition and create filesystems for an NFS home server on a service node, perform the following steps:

                    1. Use the parted(8) utility, or some other partition tool, to create a partition on /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a . The following example makes one filesystem out of the disk. You can use the parted utility interactively or in a command-line driven manner.

                    2. Make a new msdos label, as follows:

                      # parted /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a mklabel msdos
                      Information: Don't forget to update /etc/fstab, if necessary.

                    3. Find the size of the disk, as follows:

                      # parted /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a print
                      Disk geometry for /dev/sdb: 0kB - 249GB
                      Disk label type: msdos
                      Number  Start   End     Size    Type      File system  Flags
                      Information: Don't forget to update /etc/fstab, if necessary. 
                      

                    4. Create a partition that spans the disk, as follows:

                      # parted /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a mkpart primary ext2 0 249GB
                      Information: Don't forget to update /etc/fstab, if necessary.  

                    5. Issue the following command to ensure that the /dev/disk/by-id partition device file is in place and available for use with the mkfs command that follows:

                      # udevtrigger

                    6. Create a filesystem on the disk. You can choose the filesystem type.


                      Note: The mkfs.ext3 command takes more than 10 minutes to create a single 500GB filesystem using default mkfs.ext3 options. If you do not need the number of inodes created by default, use the -N option to mkfs.ext3 or other options that reduce the number of inodes. The following example creates 20 million inodes. XFS filesystems can be created in much shorter time.


                      An ext3 example is, as follows:
                      # mkfs.ext3 -N 20000000 /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a-part1

                      An xfs example is, as follows:
                      # mkfs.xfs /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a-part1 

                    7. Add the newly created filesystem to the server's fstab file and mount it. Ensure that the new filesystem is exported and that the NFS service is running, as follows:

                      1. Append the following line to your /etc/fstab file.

                        /dev/disk/by-id/scsi-3600508e0000000008dced2cfc3c1930a-part1       /home   ext3    defaults        1       2


                        Note: If you are using XFS, replace ext3 with xfs. This example uses the /dev/disk/by-id path for the device and not a /dev/sd device.


                      2. Mount the new filesystem (the fstab entry, above, enables it to mount automatically the next time the system is rebooted), as follows:

                        # mount -a

                      3. Be sure the filesystem is exported. Add the following line to the /etc/exports file. Adjust this line to match your site's access policies.

                        /home *(no_subtree_check,rw,async,no_root_squash)


                      4. Make sure the NFS server service is enabled. For SLES, use these commands:

                        # chkconfig nfsserver on
                        # /etc/init.d/nfsserver restart


                        Note: In some distributions, the NFS server init script is simply named "nfs".


                      Note: Steps 8 and 9 are performed from the system admin controller (SAC).


                    8. The following steps describe how to mount the home filesystem on the compute nodes, as follows:


                      Note: SGI recommends that you always work on clones of the SGI-supplied compute image so that you always have a base to copy to fall back to if necessary. For information on cloning a compute node image, see “Customizing Software Images” in Chapter 4.


                      1. Make a mount point in the blade image. In the following example, /home is already a mount point. If you use a different mount point, you need to do something similar to the following on the SAC. Note that the rest of the examples resume using /home.

                        # mkdir /var/lib/systemimager/images/compute-sles11-clone/my-mount-point

                      2. Add the /home filesystem to the compute nodes. SGI supplies an example script for managing this. You just need to add your new mount point to the sgi-fstab per-host-customization script.

                      3. Use a text editor to edit the following file:

                        /opt/sgi/share/per-host-customization/global/sgi-fstab

                      4. Insert the following line just after the tmpfs and devpts lines in the sgi-fstab file:

                        service0-ib1:/home  /home           nfs     hard            0       0


                        Note: In order to maximize performance, SGI advises that the ib0 fabric be used for all MPI traffic. The ib1 fabric is reserved for storage related traffic.


                      5. Use the cimage command to push the update to the rack leader controllers (RLCs) serving each compute node, as follows:

                        # cimage --push-rack compute-sles11-clone "r*"

                        Using --push-rack on an image that is already on the RLC has the simple effect of updating it with the change you made above. For more information on using the cimage command, see “cimage Command” in Chapter 4.

                    9. When you reboot the compute nodes, they will mount your new home filesystem.

                    For information on centrally managed user accounts, see “Setting Up a NIS Server for Your SGI ICE X System”. It describes NIS master setup. In this design, the master server residing on the service node provides the filesystem, and the NIS slaves reside on the RLCs. If you have more than one home server, you need to export all home filesystems on all home servers to the server acting as the NIS master, using the no_root_squash exports flag.
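
                    For example, on a second home server, the /etc/exports entry for its home filesystem might look like the following (the filesystem path and the NIS master host name are illustrative; adjust the options to your site's access policies):

                    /home2 service0(rw,async,no_root_squash)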

                    Home Directories on NAS

                    If you want to use a NAS server for scratch storage or to make home filesystems available on NAS, you can follow the instructions in “Setting Up an NFS Home Server on a Service Node for Your SGI ICE X System”. In this case, you need to replace service0-ib1 with the ib1 InfiniBand host name for the NAS server, and you need to know where the home filesystem is mounted on the NAS server in order to craft the sgi-fstab script properly.

                    RHEL Service Node House Network Configuration

                    If you plan to put your service node on the house network, you need to configure it for networking. For this, you may use the system-config-network command. It is better to use the graphical version of the tool if you are able. Use the ssh -X command from your desktop to connect to the system admin controller (SAC) and then again to connect to the service node. This should redirect graphics over to your desktop.

                    Some helpful hints are, as follows:

                    • On service nodes, the cluster interface is eth0. Therefore, do not configure this interface, as it is already configured for the cluster network.

                    • Do not make the public interface a DHCP client, as this can overwrite the /etc/resolv.conf file.

                    • Do not configure name servers; name server requests on a service node are always directed to the rack leader controller (RLC) for resolution. If you want to resolve network addresses on your house network, be sure to enable the House DNS Resolvers using the configure-cluster command on the system admin controller (SAC).

                    • Do not configure or change the search order, as this again could adjust what cluster management has placed in the /etc/resolv.conf file.

                    • Do not change the host name using the RHEL tools. You can change the hostname using the cadmin tool on the SAC.

                    • After configuring your house network interface, you can use the ifup ethX command to bring the interface up. Replace X with the number of your house network interface.

                    • If you wish this interface to come up by default when the service node reboots, be sure ONBOOT is set to yes in /etc/sysconfig/network-scripts/ifcfg-ethX (again, replace X with the proper value). The graphical tool allows you to adjust this setting while the text tool does not. A minimal example file appears after this list.

                    • If you happen to wipe out the resolv.conf file by accident and end up replacing it, you may need to issue this command to ensure that DNS queries work again:

                      # nscd --invalidate hosts
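
                    Referring to the ONBOOT item above, a minimal static configuration for a house network interface might resemble the following /etc/sysconfig/network-scripts/ifcfg-eth1 file (the interface name and all addresses are illustrative):

                      DEVICE=eth1
                      BOOTPROTO=static
                      IPADDR=192.0.2.25
                      NETMASK=255.255.255.0
                      ONBOOT=yes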

                    Setting Up a NIS Server for Your SGI ICE X System

                    This section describes how to set up a network information service (NIS) server running SLES 11 for your SGI ICE X system. If you would like to use an existing house network NIS server, see “Service Node Configuration for NIS for the House Network”. This section covers the following topics:

                    Setting Up a NIS Server Overview

                    In the procedures that follow in this section, here are some of the tasks you need to perform and system features you need to consider:

                    • Make a service node the NIS master

                     • Make the rack leader controllers (RLCs) the NIS slave servers

                    • Do not make the system admin controller (SAC) the NIS master because it may not be able to mount all of the storage types. Having the storage mounted on the NIS master server makes it far less complicated to add new accounts using NIS.

                    • If multiple service nodes provide home filesystems, the NIS master should mount all remote home filesystems. They should be exported to the NIS master service node with the no_root_squash export option. The example in the following section assumes a single service node with storage and that same node is the NIS master.

                    • No NIS traffic goes over the InfiniBand network.

                     • Compute node NIS traffic goes over Ethernet, not InfiniBand, by way of the lead-eth server name in the yp.conf file. This design feature prevents NIS traffic from affecting the InfiniBand traffic between the compute nodes.
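
                     Based on the last item above, the yp.conf file in the compute node image points at the RLC by its Ethernet name rather than an InfiniBand name, for example (a sketch; lead-eth is the server name described above):

                     ypserver lead-eth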

                    Setting Up a SLES Service Node as a NIS Master

                    This section describes how to set up a service node as a NIS master. This section only applies to service nodes running SLES.

                    Procedure 3-29. Setting Up a SLES Service Node as a NIS master

                      To set up a SLES service node as a NIS master, from the service node, perform the following steps:


                      Note: These instructions use the text-based version of YaST. The graphical version of YaST may be slightly different.


                      1. Start up YaST, as follows:

                        # yast nis_server

                      2. Choose Create NIS Master Server and click on Next to continue.

                      3. Choose an NIS domain name and place it in the NIS Domain Name window. This example uses ice.

                        1. Select This host is also a NIS client.

                        2. Select Active Slave NIS server exists.

                        3. Select Fast Map distribution.

                        5. Select Allow changes to passwords.

                        5. Click on Next to continue.

                      4. Set up the NIS master server slaves.


                        Note: You are now in the NIS Master Server Slaves Setup screen. At this point, you can enter the rack leader controllers (RLCs) that are already defined. If you add more RLCs or rediscover RLCs later, you will need to change this list. For more information, see “Tasks You Should Perform After Changing a Rack Leader Controller (RLC)”.


                      5. Select Add and enter r1lead in the Edit Slave window. Enter any other RLCs you have in the same way. Click on Next to continue.

                      6. You are now in NIS Server Maps Setup. The maps selected by default are fine. Avoid using the hosts map (not selected by default) because it can interfere with SGI ICE X system operations. Click on Next to continue.

                      7. You are now in NIS Server Query Hosts Setup. Use the default settings here. However, you may want to adjust settings for security purposes. Click on Finish to continue.

                        At this point, the NIS master is configured. Assuming you checked the This host is also a NIS client box, the service node is also configured as a NIS client to itself, and ypbind is started for you.
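
                        To spot-check the new master, you might run the following standard NIS client commands on the service node (a sketch; ice is the example domain used above). ypwhich should report the service node itself, and ypcat should list entries from the passwd map:

                          # ypwhich
                          # ypcat passwd | head -3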

                      Setting Up a SLES Service Node as a NIS Client

                      This section describes how to set up your other service nodes as NIS clients. This section applies only to service nodes running SLES 11.


                      Note: Do not perform this procedure on the NIS master service node, which you already configured as a client to itself in “Setting Up a SLES Service Node as a NIS Master”.


                      Procedure 3-30. Setting Up a SLES Service Node as a NIS Client

                        To set up a service node as a NIS client, perform the following steps:

                        1. Enable ypbind, as follows:

                          # chkconfig ypbind on 

                        2. Set the default domain (already set on the NIS master). Change ice (or whatever domain name you chose above) to the NIS domain for your SGI ICE X system, as follows:

                          # echo "ice" > /etc/defaultdomain

                        3. To ensure that no NIS traffic goes over the InfiniBand network, SGI does not recommend using NIS broadcast binding on service nodes. Instead, list a few rack leader controllers (RLCs) in the /etc/yp.conf file on each non-NIS-master service node. The following is an example /etc/yp.conf file; add or remove RLCs as appropriate. Having more entries in the list provides some redundancy: if r1lead receives excessive traffic or goes down, ypbind can use the next server in the list as its NIS server. SGI does not suggest listing other service nodes in the yp.conf file, because the resolvable names for service nodes on other service nodes use IP addresses that go over the InfiniBand network. For performance reasons, it is better to keep NIS traffic off the InfiniBand network.

                          ypserver r1lead
                          ypserver r2lead

                        4. Start the ypbind service, as follows:

                          # rcypbind start

                          The service node is now bound.

                        5. Add the NIS include statements to the end of the group, passwd, and shadow files, as follows:

                          # echo "+:::" >> /etc/group
                          # echo "+::::::" >> /etc/passwd
                          # echo "+" >> /etc/shadow

                        Setting up a SLES Rack Leader Controller (RLC) as a NIS Slave Server and Client

                        This section describes how to set up rack leader controllers (RLCs) as NIS slave servers and clients. It is possible to make these adjustments in the RLC image in /var/lib/systemimager/images instead; however, SGI does not currently recommend that approach.


                        Note: Be sure the InfiniBand interfaces are up and running before proceeding because the RLC gets its updates from the NIS Master over the InfiniBand network. If you get a "can't enumerate maps from service0" error, check to be sure the InfiniBand network is operational.


                        Procedure 3-31. Setting up an RLC as a NIS Slave Server and Client

                          Use the following set of commands from the system admin controller (SAC) to set up an RLC as a NIS slave server and client.


                          Note: Replace ice with your NIS domain name and service0 with the service node you set up as the master server.


                          admin:~ # cexec --head --all chkconfig ypserv on
                          admin:~ # cexec --head --all chkconfig ypbind on
                          admin:~ # cexec --head --all chkconfig portmap on
                          admin:~ # cexec --head --all chkconfig nscd on
                          admin:~ # cexec --head --all rcportmap start
                          admin:~ # cexec --head --all "echo ice > /etc/defaultdomain"
                          admin:~ # cexec --head --all "ypdomainname ice"
                          admin:~ # cexec --head --all "echo ypserver service0 > /etc/yp.conf"
                          admin:~ # cexec --head --all /usr/lib/yp/ypinit -s service0
                          admin:~ # cexec --head --all rcportmap start
                          admin:~ # cexec --head --all rcypserv start
                          admin:~ # cexec --head --all rcypbind start
                          admin:~ # cexec --head --all rcnscd start
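
                          As an optional check from the SAC, you might confirm that each RLC is bound and has received the maps (a sketch; output varies by site):

                          admin:~ # cexec --head --all ypwhich
                          admin:~ # cexec --head --all "ypcat passwd | wc -l"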

                          Setting up the SLES Compute Nodes to be NIS Clients

                          This section describes how to set up the compute nodes to be NIS clients. You can configure NIS on the clients to use a server list that contains only their rack leader controller (RLC). All operations are performed from the system admin controller (SAC).

                          Procedure 3-32. Setting up the Compute Nodes to be NIS Clients

                            To set up the compute nodes to be NIS clients, perform the following steps:

                            1. Create a compute node image clone. SGI recommends that you always work with a clone of the compute node images. For information on how to clone the compute node image, see “Customizing Software Images” in Chapter 4.

                            2. Change the compute nodes to use the cloned image/kernel pair, as follows:

                              admin:~ # cimage --set compute-sles11-clone 2.6.16.46-0.12-smp "r*i*n*"

                            3. Set up the NIS domain, as follows ( ice in this example):

                              admin:~ # echo "ice" > /var/lib/systemimager/images/compute-sles11-clone/etc/defaultdomain

                            4. Set up compute nodes to get their NIS service from their RLC (fix the domain name as appropriate), as follows:

                              admin:~ # echo "ypserver lead-eth" > /var/lib/systemimager/images/compute-sles11-clone/etc/yp.conf

                            5. Enable the ypbind service, using the chroot command, as follows:

                              admin:~# chroot /var/lib/systemimager/images/compute-sles11-clone chkconfig ypbind on

                            6. Set up the password, shadow, and group files with NIS includes, as follows:

                              admin:~# echo "+:::" >> /var/lib/systemimager/images/compute-sles11-clone/etc/group
                              admin:~# echo "+::::::" >> /var/lib/systemimager/images/compute-sles11-clone/etc/passwd
                              admin:~# echo "+" >> /var/lib/systemimager/images/compute-sles11-clone/etc/shadow

                            7. Push out the updates using the cimage command, as follows:

                              admin:~ # cimage --push-rack compute-sles11-clone "r*"

                            NAS Configuration for Multiple IB Interfaces

                            The NAS cube must be configured with each InfiniBand fabric interface in a separate subnet. These fabrics are separated from each other logically but attached to the same physical network. For simplicity, this guide assumes that the -ib1 fabric for the compute nodes has addresses assigned in the 10.149.0.0/16 network. This guide also assumes the lowest address the cluster management software has used is 10.149.0.1 and the highest is 10.149.1.3 (already assigned to the NAS cube).

                            For the NAS cube, you need to divide the large physical network into four smaller subnets, each of which is capable of containing all of the compute nodes and service nodes: 10.149.0.0/18, 10.149.64.0/18, 10.149.128.0/18, and 10.149.192.0/18.

                            After the storage node has been discovered, SGI personnel need to log onto the NAS box, change the network settings to use the smaller subnets, and then define the other three adapters with the same offset within each subnet. For example, if the initial configuration of the storage node set the ib0 fabric IP address to 10.149.1.3 with netmask 255.255.0.0, then after the addresses are changed, ib0=10.149.1.3:255.255.192.0, ib1=10.149.65.3:255.255.192.0, ib2=10.149.129.3:255.255.192.0, and ib3=10.149.193.3:255.255.192.0. The NAS cube should now have all four adapters connected to the fabric with IP addresses that can be pinged from the service nodes.
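
                            Once the addresses have been changed, you might confirm reachability by pinging each NAS fabric address in turn from a service node (a sketch; the addresses are the examples used above, and the service0 prompt is illustrative):

                            service0:~ # for ip in 10.149.1.3 10.149.65.3 10.149.129.3 10.149.193.3; do ping -c 1 $ip; done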


                            Note: The service nodes and the rack leads will remain in the 10.149.0.0/16 subnet.


                            For the compute blades, log into the system admin controller (SAC) and modify the /opt/sgi/share/per-host-customization/global/sgi-setup-ib-configs file. Following the line iruslot=$1, insert the following:

                            # Compute NAS interface to use
                            IRU_NODE=`basename ${iruslot}`
                            RACK=`cminfo --rack`
                            RACK=$(( ${RACK} - 1 ))
                            IRU=`echo ${IRU_NODE} | sed -e s/i// -e s/n.*//`
                            NODE=`echo ${IRU_NODE} | sed -e s/.*n//`
                            POSITION=$(( ${IRU} * 16 + ${NODE} ))
                            POSITION=$(( ${RACK} * 64 + ${POSITION} ))
                            NAS_IF=$(( ${POSITION} % 4 ))
                            NAS_IPS[0]="10.149.1.3"
                            NAS_IPS[1]="10.149.65.3"
                            NAS_IPS[2]="10.149.129.3"
                            NAS_IPS[3]="10.149.193.3"

                            Then, following the line $iruslot/etc/opt/sgi/cminfo, add the following:

                            IB_1_OCT12=`echo ${IB_1_IP} | awk -F "." '{ print $1 "." $2 }'`
                            IB_1_OCT3=`echo ${IB_1_IP} | awk -F "." '{ print $3 }'`
                            IB_1_OCT4=`echo ${IB_1_IP} | awk -F "." '{ print $4 }'`
                            IB_1_OCT3=$(( ${IB_1_OCT3} + ${NAS_IF} * 64 ))
                            IB_1_NAS_IP="${IB_1_OCT12}.${IB_1_OCT3}.${IB_1_OCT4}"

                            Then change the IPADDR='${IB_1_IP}' and NETMASK='${IB_1_NETMASK}' lines to the following:

                            IPADDR='${IB_1_NAS_IP}'
                            NETMASK='255.255.192.0'

                            Then add the following to the end of the file:

                            # ib-1-vlan config
                            cat << EOF >$iruslot/etc/sysconfig/network/ifcfg-vlan1
                            # ifcfg config file for vlan ib1
                            BOOTPROTO='static'
                            BROADCAST=''
                            ETHTOOL_OPTIONS=''
                            IPADDR='${IB_1_IP}'
                            MTU=''
                            NETMASK='255.255.192.0'
                            NETWORK=''
                            REMOTE_IPADDR=''
                            STARTMODE='auto'
                            USERCONTROL='no'
                            ETHERDEVICE='ib1'
                            EOF
                            if [ $NAS_IF -eq 0 ]; then
                                rm $iruslot/etc/sysconfig/network/ifcfg-vlan1
                            fi
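
                            As a worked example of the NAS interface selection above (a sketch, assuming the compute node r1i0n5, that is, rack 1, IRU 0, node 5), the arithmetic reduces as follows and selects the second NAS address:

                            # illustrative check for rack 1, IRU 0, node 5 (r1i0n5)
                            RACK=0                               # cminfo --rack returns 1; the script subtracts 1
                            IRU=0; NODE=5
                            POSITION=$(( IRU * 16 + NODE ))      # 5
                            POSITION=$(( RACK * 64 + POSITION )) # 5
                            echo $(( POSITION % 4 ))             # NAS_IF=1, so this blade uses 10.149.65.3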

                            To update the fstab for the compute blades, edit the /opt/sgi/share/per-host-customization/global/sgi-fstab file. Perform the same steps as above to add the # Compute NAS interface to use section to this file. Then, to specify mount points, add lines similar to the following example:

                            # SGI NAS Server Mounts
                            ${NAS_IPS[${NAS_IF}]}:/mnt/data/scratch     /scratch nfs    defaults 0 0

                            Creating User Accounts

                            The example in this section assumes that the home directory is mounted on the NIS master service node and that the NIS master is able to create directories and files on it as root. The following example uses command-line commands. You could also create accounts using YaST.

                            Procedure 3-33. Creating User Accounts on a NIS Server

                              To create user accounts on the NIS server, perform the following steps:

                              1. Log in to the NIS Master service node as root.

                              2. Issue a useradd command similar to the following:

                                # useradd -c "Joe User" -m -d /home/juser juser

                              3. Provide the user a password, as follows:

                                # passwd juser

                              4. Push the new account to the NIS servers, as follows:

                                # cd /var/yp && make
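
                              To confirm that the account has been pushed, you might query the map from the master or from any bound client (a sketch; juser as in the example above):

                                # ypmatch juser passwd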

                              Tasks You Should Perform After Changing a Rack Leader Controller (RLC)

                              If you add or remove an RLC, for example, if you use the discover command to discover a new rack of equipment, you need to configure the new RLC as a NIS slave server, as described in “Setting up a SLES Rack Leader Controller (RLC) as a NIS Slave Server and Client”.

                              In addition, you need to add or remove the RLC in the /var/yp/ypservers file on the NIS master service node. Remember to use the -ib1 name for the RLC, because service nodes cannot resolve r2lead-style names; for example, use r2lead-ib1.
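
                              As an illustration, after adding a second rack, the /var/yp/ypservers file on the NIS master might contain entries such as the following (the names depend on your racks):

                                r1lead-ib1
                                r2lead-ib1

                              After updating the file, rebuild and push the NIS maps, as follows: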

                              # cd /var/yp && make

                              Installing SMC for SGI ICE Patches and Updating SGI ICE Systems

                              This section describes how to update the software on an SGI ICE system.


                              Note: To use the Subscription Management Tool (SMT) and run the sync-repo-updates script, you must register your system with Novell using Novell Customer Center Configuration. This is in the Software category of YaST (see “Register with Novell” and “Configuring the SMT Using YaST”).


                              Overview of Installing SMC for SGI ICE Patches

                              SGI supplies updates to SMC for SGI ICE software via the SGI update server at https://update.sgi.com/ . Access to this server requires a Supportfolio login and password. Access to SUSE Linux Enterprise Server updates requires a Novell login account and registration.

                              The initial installation process for the SGI ICE system set up a number of package repositories in the /tftpboot directory on the system admin controller (SAC). The SMC for SGI ICE related packages are in directories located under the /tftpboot/sgi directory. The SUSE Linux Enterprise Server 11 (SLES 11) distribution packages are in /tftpboot/distro/sles11.

                              When SGI releases updates, you may run sync-repo-updates (described later) to download the updated packages that are part of a patch. The sync-repo-updates command automatically positions the files properly under /tftpboot.

                              Once the local repositories contain the updated packages, you can update the various SGI ICE X system admin controller (SAC), rack leader controller (RLC), and managed service node images using the cinstallman command. The cinstallman command is used for all package updates, both within images and on running nodes, including the SAC itself.

                              A small amount of preparation is required to set up an SGI ICE system so that updated packages can be downloaded from the SGI update server and the Linux distribution update server and then installed with the cinstallman command.

                              The following sections describe these steps.

                              Update the Local Package Repositories on the System Admin Controller (SAC)

                              This section explains how to update the local product package repositories needed to share updates on all of the various nodes on an SGI ICE X system.

                              Mirroring Distribution Updates

                              In order to keep your system up to date, there are various methods for getting package updates to your SGI ICE X system.

                              SGI has integrated software updates with the distribution update tools provided by SLES and RHEL. However, this integration only works when the software distribution being updated is the same as the distribution running on the system admin controller (SAC); for example, a SLES 11 SAC cannot easily get Red Hat updates from RHN. The sections below describe how to manage package updates when the distro installed on the SAC matches the rest of the system, followed by some approaches for SGI ICE X systems that have a mix of distributions.

                              Update the SGI Package Repositories on the System Admin Controller (SAC)

                              SGI provides a sync-repo-updates script to help keep your local package repositories on the SAC synchronized with available updates for the SMC for SGI ICE X, SGI Foundation, SGI Performance Suite, and SLES products. The script is located in /opt/sgi/sbin/sync-repo-updates on the SAC.

                              The sync-repo-updates script requires your Supportfolio user name and password. You can supply these on the command line, or the script will prompt you for them. With this login information, the script contacts the SGI update server and downloads the updated packages into the appropriate local package repositories.

                              For SLES, if you installed and configured the SMT tool as described in “SLES System Admin Controllers (SACs): Update the SLES Package Repository ”, the sync-repo-updates script will also download any updates to SLES from the Novell update server. When all package downloads are complete, the script updates the repository metadata.

                              Once the script completes, the local package repositories on the SAC should contain the latest available package updates and be ready to use with the cinstallman command.

                              The sync-repo-updates script operates on all repositories, not just the currently selected repository.


                              Note: You can use the crepo command to set up custom repositories. If you add packages to these custom repositories later, you need to run the yume --prepare --repo command on the custom repository so that the metadata is up to date. Then run the cinstallman --yum-node --node admin clean all command before running any yum, yume, or cinstallman command.
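
                              A minimal sketch of that sequence, assuming a custom repository at /tftpboot/sgi/my-custom-repo (the repository path is an illustrative placeholder):

                              admin:~ # yume --prepare --repo /tftpboot/sgi/my-custom-repo
                              admin:~ # cinstallman --yum-node --node admin clean all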


                              SLES System Admin Controllers (SACs): Update the SLES Package Repository

                              In release 1.8 (or later), SLES updates are mirrored to the SAC using the SUSE Linux Enterprise Subscription Management Tool. The Subscription Management Tool (SMT) is used to mirror and distribute updates from Novell. SMC for SGI ICE software only uses the mirror abilities of this tool. Mechanisms within SMC for SGI ICE are used to deploy updates to installed nodes and images. SMT is described in detail in the SUSE Linux Enterprise Subscription Management Tool Guide. A copy of this manual is in the SMT_en.pdf file located in the /usr/share/doc/manual/sle-smt_en directory on the SAC. Use the scp(1) command to copy the manual to a location where you can view it, as follows:

                              admin:~ # scp /usr/share/doc/manual/sle-smt_en/SMT_en.pdf user@domain_name.mycompany.com:

                              Register with Novell

                              Register your system with Novell using Novell Customer Center Configuration. This is in the Software category of YaST. When registering, use the email address that is already on file with Novell. If there is not one on file, use a valid email address that you can associate with your Novell login at a future date.

                              The SMT will not be able to subscribe to the necessary update channels unless it is configured to work with a properly authorized Novell login. If you have an activation code or if you have entitlements associated with your Novell login, the SMT should be able to access the necessary update channels.

                              More information on how to register, how to find activation codes, and how to contact Novell with questions about registration can be found in the YaST help for Novell Customer Center Configuration.

                              Configuring the SMT Using YaST

                              At this point, your system admin controller (SAC) should be registered with Novell. You should also have a Novell login available that is associated with the SAC. This Novell login will be used when configuring the SMT described in this section. If the Novell login does not have proper authorization, you will not be able to register the appropriate update channels. Contact Novell with any questions on how to obtain or properly authorize your Novell login for use with the SMT.

                              Procedure 3-34. Configuring SMT Using YaST


                                Note: In step 8, a window pops up asking you for the Database root password. View the file /etc/odapw. Enter the contents of that file as the password in the blank box.


                                To configure SMT using YaST, perform the following steps:

                                1. Start up the YaST tool, as follows:

                                  admin:~ # yast

                                2. Under Network Services, find SMT Configuration

                                3. For Enable Subscription Management Tool Service (SMT), check the box.

                                4. For NU User, enter your Novell user name.

                                5. For NU Password, enter your Novell password.


                                  Note: You need the mirror credentials. It is possible to have a login that can get updates but cannot mirror the repository.


                                6. For NU E-Mail, use the email with which you registered.

                                7. For your SMT Server URL, just leave the default.

                                  It is a good idea to use the test feature. This will at least confirm basic functionality with your login. However, it does not guarantee that your login has access to all the desired update channels.

                                  Note that Help is available within this tool regarding the various fields.

                                8. When you click Next, a window pops up asking for the Database root password. View the file /etc/odapw. Enter the contents of that file as the password in the blank box.

                                  A window will likely pop up telling you that you do not have a certificate. You will then be given a chance to create the default certificate. Note that when that tool comes up, you will need to set the password for the certificate by clicking on the certificate settings.

                                Setting up SMT to Mirror Updates

                                This section describes how to set up SMT to mirror the appropriate SLES updates.

                                Procedure 3-35. Setting up SMT to Mirror Updates

                                  To set up SMT to mirror updates, from the system admin controller (SAC), perform the following steps:

                                  1. Refresh the list of available catalogs, as follows:

                                    admin:~ # smt-ncc-sync

                                  2. Look at the available catalogs, as follows:

                                    admin:~ # smt-catalogs

                                    In that listing, you should see that the majority of the catalogs matching the SAC distribution (sles11 in this example) have "Yes" in the "Can be Mirrored" column.

                                  3. Use the smt-catalogs -m command to show just the catalogs that you are allowed to mirror.

                                  4. From the Name column, choose the channels ending in -Updates that match the installed distro. For example, if the base distro is SLES 11, you might choose the following:

                                    SLE11-SMT-Updates
                                    SLE11-SDK-Updates
                                    SLES11-Updates

                                  5. This step shows how you might enable the catalogs. Each time, you are presented with a menu of choices. Be sure to select only the x86_64 version; if given a choice between sles and sled, choose sles. For example:

                                    admin:~ # smt-catalogs -e SLE11-SMT-Updates
                                    admin:~ # smt-catalogs -e SLE11-SDK-Updates
                                    admin:~ # smt-catalogs -e SLES11-Updates

                                    In the example above, select entry 7 because it is the x86_64 sles version; the others are not.

                                  6. Use the smt-catalogs -o command to show only the enabled catalogs. Make sure that it shows the channels you need set up for mirroring.


                                    Warning: SMC for SGI ICE does not map the concept of channels onto its repositories. This means that any channel you subscribe to will have its RPMs placed into the distribution repository. Therefore, only subscribe the SMC for SGI ICE X SAC to channels related to your SMC for SGI ICE X cluster needs.


                                  Downloading the Updates from Novell and SGI

                                  At this time, you should have your update channels registered. From here on, the sync-repo-updates script does the rest of the work. That script uses SMT to download all the updates and position them into the existing repositories so that the various nodes and images can be upgraded.

                                  Run the /opt/sgi/sbin/sync-repo-updates script.

                                  After this completes, you need to update your nodes and images (see “Installing Updates on Running System Admin Controller (SAC), Rack Leader Controller (RLC), and Service Nodes ”).


                                  Note: Be advised that the first sync with the Novell server will take a very long time.


                                  RHEL System Admin Controller (SAC): Update the RHEL Package Repository

                                  This section describes how to keep your packages up to date on RHEL-based SACs. The general idea is to download all updates into the RHEL repository and then use SMC for SGI ICE tools to deploy the updates to nodes and images.

                                  Perform the following:

                                  • Register with RHN, as follows:

                                    # rhn_register

                                  • Once you are registered, run the sync-repo-updates command to synchronize the latest update packages into the RHEL 6 repository on the system:

                                     # sync-repo-updates

                                  Update Distros That Do Not Match the System Admin Controller (SAC)

                                  In situations where you have software distributions (distros) present that do not match the distro installed on the SAC, you have to arrange to download the updates on your own.

                                  SLES

                                  The instructions provided earlier show how to set up Novell SMT for the system admin controller (SAC). You could use similar ideas to configure your own SMT server somewhere on your network. Once the RPMs are staged on that server, you can copy them to the SAC using rsync or a similar transport method. Remember to update the repository metadata after you update the packages. For example:

                                  # yume --prepare --repo /tftpboot/distro/sles11sp1
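
                                  As a sketch of the copy step that precedes the metadata update above (the smt-server host name and staging path are illustrative placeholders; use the locations your SMT server actually stages to):

                                  admin:~ # rsync -av smt-server:/path/to/staged/rpms/ /tftpboot/distro/sles11sp1/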

                                  RHEL

                                  You can register with RHN on a RHEL server on your network. Then, look at the /opt/sgi/sbin/sync-repo-updates script to see how it stages the packages (search for RHN in that file). Following that example, you can set up a server on your house network to stage the files and then copy the staged packages into the matching distro repository on the system admin controller (SAC). After copying the packages, update the repository metadata using yume, in a way similar to this:

                                  # yume --prepare --repo /tftpboot/distro/rhel6.0


                                  Note: You can always use a managed service node to stage the updates.


                                  Installing Updates on Running System Admin Controller (SAC), Rack Leader Controller (RLC), and Service Nodes

                                  This section explains how to update existing nodes and images to the latest packages in the repositories.

                                  To install updates on the SAC, perform the following command from the SAC:

                                  admin:~ # cinstallman --update-node --node admin

                                  To install updates on all online RLCs, perform the following command from the SAC:

                                  admin:~ # cinstallman --update-node --node r\*lead

                                  To install updates on all managed and online service nodes, perform the following from the SAC:

                                  admin:~ # cinstallman --update-node --node service\*

                                  To install updates on the SAC, all online RLCs, and all online and managed service nodes with one command, perform the following command from the SAC:

                                  admin:~ # cinstallman --update-node --node \*

                                  Please note the following:

                                  • The cinstallman command does not operate on running compute nodes. For compute nodes, it is an image management tool only. You can use it to create and update compute images and use the cimage command to push those images out to RLCs (see “cimage Command” in Chapter 4).

                                    For managed service nodes and RLCs, you can use the cinstallman command to update a running system as well as the images on that system.

                                  • When you use a node aggregation (for example, the asterisk (*), as shown in the examples above), any node that happens to be unreachable is skipped. Therefore, you should verify that all expected nodes received their updated packages.

                                  • For more information on the crepo and cinstallman commands, see “crepo Command” in Chapter 4 and “cinstallman Command” in Chapter 4, respectively.

                                  Updating Packages Within System Imager Images

                                  You can also use the cinstallman command to update systemimager images with the latest software packages.


                                  Note: Changes to the kernel package inside the compute image require some additional steps before the new kernel can be used on compute nodes (see “Additional Steps for Compute Image Kernel Updates” for more details). This note does not apply to rack leader controllers (RLCs) or managed service nodes.


                                  The following examples show how to upgrade the packages inside the three node images supplied by SGI:

                                  admin:~ # cinstallman --update-image --image lead-sles11 
                                  admin:~ # cinstallman --update-image --image service-sles11 
                                  admin:~ # cinstallman --update-image --image compute-sles11


                                  Note: Changes to the compute image on the system admin controller (SAC) are not seen by the compute nodes until the updates have been pushed to the RLCs with the cimage command. Updating RLC and managed service node images ensures that the next time you add, re-discover, or re-image an RLC or service node, it already contains the updated packages.


                                  Before pushing the compute image to the RLC using the cimage command, it is a good idea to clean the yum cache.


                                  Note: The yum cache can grow, and it resides in the writable portion of the compute blade image. This means it is replicated 64 times per compute blade image per rack, and the space available to compute blades is limited by design to minimize network and load issues on RLCs.


                                  To clean the yum cache, from the SAC, perform the following:

                                  admin:~ # cinstallman --yum-image --image compute-sles11 clean all

                                  Additional Steps for Compute Image Kernel Updates

                                  Any time a compute image is updated with a new kernel, you need to perform some additional steps to make the new kernel available. The following example assumes that the compute node image name is compute-sles11 and that you have already updated the compute node image in the image directory per the instructions in “Creating Compute and Service Node Images Using the cinstallman Command” in Chapter 4. If you have named your compute node image something other than compute-sles11, substitute that name in the example that follows:

                                  1. Shut down any compute nodes that are running the compute-sles11 image (see “Power Management Commands” in Chapter 4).

                                  2. Push out the changes with the cimage --push-rack command, as follows:

                                    admin:~ # cimage --push-rack compute-sles11 r\* 

                                  3. Update the database to reflect the new kernel in the compute-sles11 image, as follows:

                                    admin:~ # cimage --update-db compute-sles11

                                  4. Verify the available kernel versions and select one to associate with the compute-sles11 image, as follows:

                                    admin:~ # cimage --list-images

                                  5. Associate the compute nodes with the new kernel/image pairing, as follows:

                                    admin:~ # cimage --set compute-sles11 2.6.16.46-0.12-smp "r*i*n*"


                                    Note: Replace 2.6.16.46-0.12-smp with the actual kernel version.


                                  6. Reboot the compute nodes with the new kernel/image.

                                  Installing MPI on a Running SGI ICE X System

                                  This section describes how to install MPI on an SGI ICE X system that has already been deployed. The instructions in this section update existing images instead of creating new ones. Note that it is easier to integrate MPI before cluster deployment.

                                  SGI-supplied media, such as the SGI® MPI and SGI® Accelerate™ CDs, embed suggested package lists for each node type. The crepo command, used in the following example, makes use of these lists and recomputes them when new media is added and then selected.

                                  The file names in this example are illustrations only.

                                  Register SGI MPI and SGI Accelerate with SMC, as follows:

                                  # crepo --add accelerate-1.0-cd1-media-rhel6-x86_64.iso
                                  # crepo --add mpi-1.0-cd1-media-rhel6-x86_64.iso

                                  Update the crepo selected repositories so that all repositories associated with the software distribution (distro) you are installing for are present. For example, if you want MPI to work on RHEL 6, you might do something like this:

                                  Show what is currently selected (the asterisks to the left):

                                  # crepo --show
                                  * SGI-Management-Center-1.5-rhel6 : /tftpboot/sgi/SGI-Management-Center-1.5-rhel6
                                  * SGI-Foundation-Software-2.5-rhel6 : /tftpboot/sgi/SGI-Foundation-Software-2.5-rhel6
                                  * SGI-XFS-XVM-2.5-for-RHEL-rhel6 : /tftpboot/sgi/SGI-XFS-XVM-2.5-for-RHEL-rhel6
                                  * SGI-Accelerate-1.3-rhel6 : /tftpboot/sgi/SGI-Accelerate-1.3-rhel6
                                  * SGI-Tempo-2.5-rhel6 : /tftpboot/sgi/SGI-Tempo-2.5-rhel6
                                  * SGI-MPI-1.3-rhel6 : /tftpboot/sgi/SGI-MPI-1.3-rhel6
                                  * Red-Hat-Enterprise-Linux-6.2 : /tftpboot/distro/rhel6.2
                                  

                                  Unselect unrelated repositories:

                                  # crepo --unselect SGI-Tempo-2.5-rhel6
                                  Updating: /etc/opt/sgi/rpmlists/generated-compute-rhel6.2.rpmlist
                                  Updating: /etc/opt/sgi/rpmlists/generated-service-rhel6.2.rpmlist
                                  # crepo --unselect SGI-Foundation-Software-2.5-rhel6
                                  Updating: /etc/opt/sgi/rpmlists/generated-compute-rhel6.2.rpmlist
                                  Updating: /etc/opt/sgi/rpmlists/generated-service-rhel6.2.rpmlist
                                  # crepo --unselect Red-Hat-Enterprise-Linux-6.2
                                  Removing: /etc/opt/sgi/rpmlists/generated-compute-rhel6.2.rpmlist
                                  Removing: /etc/opt/sgi/rpmlists/generated-service-rhel6.2.rpmlist
                                  

                                  Select RHEL 6 related repositories:

                                  # crepo --select Red-Hat-Enterprise-Linux-6.2
                                  Updating: /etc/opt/sgi/rpmlists/generated-compute-rhel6.2.rpmlist
                                  Updating: /etc/opt/sgi/rpmlists/generated-lead-rhel6.2.rpmlist
                                  Updating: /etc/opt/sgi/rpmlists/generated-service-rhel6.2.rpmlist
                                  # crepo --select SGI-Foundation-Software-2.5-rhel6
                                  Updating: /etc/opt/sgi/rpmlists/generated-compute-rhel6.2.rpmlist
                                  Updating: /etc/opt/sgi/rpmlists/generated-lead-rhel6.2.rpmlist
                                  Updating: /etc/opt/sgi/rpmlists/generated-service-rhel6.2.rpmlist
                                  # crepo --select SGI-XFS-XVM-2.5-for-RHEL-rhel6
                                  Updating: /etc/opt/sgi/rpmlists/generated-compute-rhel6.2.rpmlist
                                  Updating: /etc/opt/sgi/rpmlists/generated-lead-rhel6.2.rpmlist
                                  Updating: /etc/opt/sgi/rpmlists/generated-service-rhel6.2.rpmlist
                                  

                                  After performing the steps above, the proper repositories are registered and selected, so you can operate on them by default. Because you are working with an already deployed system, you need to update existing images and potentially the existing service nodes themselves. This example uses the SGI-generated rpmlists. If you have custom rpmlists, you need to reconcile the two lists manually for each node type. The list fragments in /var/opt/sgi/sgi-repodata/ may help you.

                                  For a service node image, perform the following:

                                  # cinstallman --refresh-image --image service-rhel6.2 --rpmlist /etc/opt/sgi/rpmlists/generated-service-rhel6.2.rpmlist
                                  

                                  For a compute node image, perform the following:

                                  # cinstallman --refresh-image --image compute-rhel6.2 --rpmlist /etc/opt/sgi/rpmlists/generated-compute-rhel6.2.rpmlist
                                  

                                  Finally, you need to push the updated compute image to the rack leader controllers (RLCs).


                                  Note: If the compute nodes are booted on the image and are using NFS roots, you need to shut the compute nodes down before you can run this command.


                                  # cimage --push-rack compute-rhel6.2 r"*"
                                  

                                  To make sure the compute nodes you are operating on are associated with the compute image you just updated, perform a command similar to the following:

                                  # cimage --set compute-rhel6.2 2.6.32-71.el6.x86_64 "*"

                                  You can find the available images and kernels using the cimage --list-images command.

                                  If you have booted service/login nodes, you likely want to refresh those running nodes as well. (You could also reinstall them.) Here is a refresh example:

                                  # cinstallman --refresh-node --node service0 --rpmlist /etc/opt/sgi/rpmlists/generated-service-rhel6.2.rpmlist
                                  

                                  Now reset or bring up the nodes, depending on the state in which you left them. If you want to bring up all nodes, the following command does not disrupt nodes that are already operating:

                                  # cpower --system --up
