Chapter 3. Features of the RAID Controller

This chapter describes the features and operation of the RAID controller in the following sections.

Enclosure Services Interface (ESI) and Disk Drive Control

Both the JBOD and RAID LRC I/O modules use enclosure services interface (ESI) commands to manage the physical storage system. ESI provides support for the disk drives, power supplies, temperature, door lock, alarms, and the controller electronics for the enclosure services. The storage system ESI/ops panel firmware includes SCSI Enclosure Services (SES).


Note: These services are performed by drives installed in bays 1/1 and 4/4; these drives must be present for the system to function. See Figure 1-17 for a diagram of their locations.

ESI is accessed through an enclosure services device, which is included in the ESI/ops module. SCSI commands are sent to a direct access storage device (namely, the drives in bays 1/1 and 4/4) and are passed through to the SES device.

During controller initialization, each device attached to each loop is interrogated, and the inquiry data is stored in controller RAM. If ESI devices are detected, the ESI process is started. The ESI process polls and updates the following data:

  • Disk drive insertion status

  • Power supply status

  • Cooling element status

  • Storage system temperature

The LEDs on the ESI/ops panel show the status of these components.
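
As a conceptual illustration, the polling cycle described above can be modeled with a short sketch. The following Python fragment is illustrative only; the component names and helper functions are hypothetical stand-ins, not controller firmware.

    # Minimal sketch of the ESI polling cycle (illustrative only; the
    # helpers are hypothetical stand-ins, not controller firmware).
    COMPONENTS = ("drive_slots", "power_supplies", "cooling", "temperature")

    def read_ses_status(component):
        # In the real system this is a SCSI command routed through the
        # drives in bays 1/1 and 4/4 to the SES device in the ESI/ops module.
        return "ok"  # stub value for the sketch

    def update_ops_panel_leds(status):
        # The ESI/ops panel LEDs reflect the most recent poll results.
        for component, state in status.items():
            print(component, state)

    def esi_poll_once():
        status = {c: read_ses_status(c) for c in COMPONENTS}
        update_ops_panel_leds(status)

    esi_poll_once()  # the controller repeats this cycle continuously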

Configuration on Disk (COD)

Configuration on disk (COD) retains the latest version of the saved configuration at a reserved location on every physical drive. The RAID controller in the 2 Gb TP9100 (Mylex FFx-2) uses COD version 2.1. Previous versions of the TP9100 use COD version 1.0.

Controller firmware versions prior to 7.0 use the COD 1.0 format; firmware versions 7.0 and later use the COD 2.1 format. Support for the FFx-2 RAID controller began with firmware version 8.0.

The COD information stored on each drive is composed of the following (a conceptual sketch follows the list):

  • Device definition, which contains the following information:

    • The logical device definition/structure for those logical devices dependent on this physical device. This information should be the same for all physical devices associated with the defined logical device.

    • Information specific to this physical device, which may differ among physical devices even when they are part of the same logical device definition.

    • Data backup for data migration. This area also includes information required by the background initialization feature.

  • User device name information and host software configuration parameters. This information is defined by the user and should be the same on all physical drives that are associated with the defined logical drive.

  • COD 2.1 locking mechanism. This feature provides a locking mechanism for systems with multiple controllers. When any controller is allowed to update COD information independently of the other controllers, this feature lets the controller lock the COD information for write access before updating that drive. This prevents multiple controllers from updating the COD at the same time.
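
As a rough model, the COD contents listed above can be pictured as a per-drive record. The following Python sketch is conceptual; the field names are assumptions and do not reflect the actual COD 2.1 on-disk binary format.

    # Conceptual model of the per-drive COD record described above.
    # Field names are assumptions; the real COD 2.1 format is a binary
    # structure defined by the controller firmware.
    from dataclasses import dataclass

    @dataclass
    class CodRecord:
        logical_device_definition: dict  # same across all members of a logical device
        physical_device_info: dict       # unique to this physical drive
        migration_backup: bytes          # data migration / background initialization area
        user_device_name: str            # user-defined; same across member drives
        host_config_params: dict         # host software configuration parameters
        lock_owner: int = -1             # COD 2.1 lock: controller holding write access
        timestamp: int = 0               # compared at power-on to find the newest COD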

COD plays a significant role during the power-on sequence after a controller is replaced. The replacement controller tests the validity of any configuration currently present in its NVRAM. Then it tests the validity of the COD information on all disk drives in the storage system. The final configuration is determined by the following rules (a minimal sketch follows the list):

  1. The controller uses the most recent COD information available, no matter where it is stored. The most recent COD information is then written to all configured drives. Unconfigured drives are not updated; all COD information on these drives is set to zero.

  2. If all of the COD information has an identical timestamp, the controller uses the COD information stored in its NVRAM.
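
These two rules reduce to a comparison of timestamps. The following Python sketch models the decision; the dictionary layout is an assumption made for illustration, not the controller's actual implementation.

    # Illustrative model of the two power-on rules above. Each candidate
    # is a dict with "timestamp" and "config" keys (an assumed layout).
    def resolve_configuration(nvram, drive_cods):
        candidates = [nvram] + list(drive_cods)
        newest = max(c["timestamp"] for c in candidates)
        # Rule 2: identical timestamps everywhere -- the NVRAM copy wins.
        if all(c["timestamp"] == newest for c in candidates):
            return nvram["config"]
        # Rule 1: otherwise the most recent COD wins, wherever it is stored;
        # the winner is then written back to all configured drives.
        return next(c["config"] for c in candidates if c["timestamp"] == newest)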


    Caution: Any existing COD on a disk drive that is inserted after the controller has started (STARTUP COMPLETE) will be overwritten.



    Caution: Mixing controllers or disk drives from systems running different versions of firmware presents special situations that may affect data integrity. If a new disk drive containing configuration data is added to an existing system while power is off, the controller may incorrectly adopt the configuration data from the new drive. This may destroy the existing valid configuration and result in potential loss of data. Always add drives while power is supplied to the system to avoid potential loss of data.


Drive Roaming

Drive roaming allows disk drives to be moved to other channel/target ID locations while the system is powered down. This makes disassembly and reassembly of systems easier and can enhance performance by optimizing channel usage.

Drive roaming uses the Configuration on Disk (COD) information stored on the physical disk drive. When the system restarts, the controller generates a table that contains the current location of each disk drive and the location of each drive when the system was powered down. This table is used to remap the physical disk drives into their proper location in the system drive. This feature is designed for use within one system environment, for example, a single system or a cluster of systems sharing a simplex or dual-active controller configuration. Foreign disk drives containing valid COD information from other systems must not be introduced into a system. If the COD information on a replacement disk drive is questionable or invalid, the disk drive will be labeled unconfigured offline or dead.
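
Conceptually, the controller matches each drive's COD identity against the locations saved at power-down. The following Python sketch illustrates such a remap table; the use of drive serial numbers as keys is an assumption made for the sketch.

    # Illustrative drive-roaming remap. Keys and structure are assumptions;
    # the controller's actual table format is internal to the firmware.
    def build_remap_table(current_drives, saved_layout):
        # current_drives: {(channel, target_id): drive_serial} as found now
        # saved_layout:   {drive_serial: (channel, target_id)} at power-down
        remap = {}
        for location, serial in current_drives.items():
            if serial in saved_layout:
                remap[location] = saved_layout[serial]  # restore logical position
            else:
                remap[location] = None  # questionable/invalid COD: unconfigured offline
        return remap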

If a drive fails in a RAID level that uses a hot spare, drive roaming allows the controller to keep track of the new hot spare, which is the replacement for the failed drive.


Caution: Mixing controllers or disk drives from systems running different versions of firmware presents special situations that may affect data integrity. If a new disk drive containing configuration data is added to an existing system while power is off, the controller may incorrectly adopt the configuration data from the new drive. This may destroy the existing valid configuration and result in potential loss of data. Always add drives while power is supplied to the system to avoid potential loss of data.


Data Caching

RAID controllers can be operated with write cache enabled or disabled. The following subsections describe these modes.

Write caching is set independently for each system drive in the system management software.

Write Cache Enabled (Write-back Cache Mode) 

If write cache is enabled (write-back cache mode), a write completion status is issued to the host initiator when the data is stored in the controller's cache, but before the data is transferred to the disk drives. In dual-active controller configurations with write cache enabled, the write data is always copied to the cache of the second controller before completion status is issued to the host initiator.

Enabling write cache enhances performance significantly for data write operations; there is no effect on read performance. However, in this mode a write complete message is sent to the host system as soon as data is stored in the controller cache; some delay may occur before this data is written to disk. During this interval there is risk of data loss in the following situations:

  • If only one controller is present and this controller fails.

  • If power to the controller is lost and its internal battery fails or is discharged.

Write Cache Disabled (Write-through or Conservative Cache Mode)

If write cache is disabled (write-through data caching is enabled), write data is transferred to the disk drives before completion status is issued to the host initiator. In this mode, system drives configured with the write cache enabled policy are treated as though they were configured with write cache disabled, and the cache is flushed.

Disabling write cache (enabling write-through or conservative cache mode) provides a higher level of data protection after a critical storage system component has failed. When the condition that disabled write cache is resolved, the system drives are restored to their original settings.
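
The essential difference between the two modes is when completion status is returned to the host relative to the disk write. The following Python sketch illustrates that ordering; the function and variable names are hypothetical stand-ins for controller-internal operations.

    # Illustrative ordering of the two cache modes; names are hypothetical.
    def ack_host(mode):
        print(mode + ": completion status sent to host initiator")

    def write_back(data, cache, partner_cache, disk):
        cache.append(data)
        partner_cache.append(data)  # dual-active: mirrored to partner first
        ack_host("write-back")      # acknowledged BEFORE data reaches disk
        disk.append(cache.pop(0))   # data flushed to disk later

    def write_through(data, disk):
        disk.append(data)           # data committed to disk first
        ack_host("write-through")   # acknowledged only after the disk write

    cache, partner, disk = [], [], []
    write_back("block-0", cache, partner, disk)
    write_through("block-1", disk)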

Conditions that disable write cache are as follows (a minimal sketch of this policy follows the list):

  • The Enable Conservative Cache controller parameter is enabled in TPM for a dual-active controller configuration, and a controller failure has occurred.

  • A power supply has failed (not simply that a power supply is not present).  

    In this case the SES puts the RAID into conservative cache mode. This condition also triggers the audible alarm.

  • An out-of-limit temperature condition exists.

    In this case the SES puts the RAID into conservative cache mode. This condition also triggers the audible alarm.

  • The controller receives an indication of an AC failure.
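
Taken together, these conditions amount to a simple policy check, as in the following Python sketch; the flag names are assumptions made for illustration, not actual controller parameters.

    # Illustrative policy check for the conditions listed above;
    # the flag names are assumptions, not controller parameters.
    def write_cache_allowed(state):
        if state["conservative_cache_enabled"] and state["partner_failed"]:
            return False  # dual-active partner failure with conservative cache on
        if state["power_supply_failed"]:
            return False  # SES forces conservative cache and sounds the alarm
        if state["temperature_out_of_limit"]:
            return False  # SES forces conservative cache and sounds the alarm
        if state["ac_failure_reported"]:
            return False
        return True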

To protect against single-controller failure, certain releases of the storage system support dual controllers. To protect against power loss, an internal battery in the controller module maintains the data for up to 72 hours.

RAID Disk Topologies

The 2 Gb TP9100 RAID enclosure can be configured with any of the following topologies:

Simplex Single-port RAID Topology

Figure 3-1 illustrates a simplex single-port RAID configuration that uses a single host. This configuration:

  • Supports transfer speeds up to 200 MB/s

  • Does not support failover capabilities

Figure 3-1. Simplex Single-port RAID Topology


Duplex Single-port RAID Topology

Figure 3-2 illustrates a duplex single-port RAID configuration. This configuration:

  • Supports transfer speeds up to 400 MB/s

  • Supports failover capabilities

  • Supports SGI FailSafe high-availability solutions

    Figure 3-2. Duplex Single-port RAID Topology


Simplex Dual-port RAID Topology

Figure 3-3 illustrates a simplex dual-port RAID configuration using two hosts. This configuration:

  • Supports transfer speeds up to 400 MB/s

  • Supports failover capabilities

  • Supports SGI FailSafe high-availability solutions

    Figure 3-3. Simplex Dual-port Dual-host RAID Topology


Duplex Dual-port RAID Topology

Figure 3-4 illustrates a duplex dual-port RAID configuration using two hosts and two controllers. This configuration:

  • Supports transfer speeds up to 400 MB/s

  • Supports failover capabilities

  • Supports SGI FailSafe high-availability solutions

    Figure 3-4. Duplex Dual-port RAID Configuration



    Caution: If two independent systems access the same volume of data and the operating system does not support file locking, data corruption may occur. To avoid this, create two or more volumes (or LUNs) and configure each volume to be accessed by one system only.


Dual-port Duplex Two-Host RAID Configuration

Figure 3-5 illustrates a dual-port, duplex, dual-path RAID configuration that uses two hosts. This configuration:

  • Supports transfer speeds up to 400 MB/s

  • Supports failover capabilities

  • Supports SGI FailSafe high-availability solutions

    Figure 3-5. Dual-port Dual-path Attached Duplex RAID Topology



    Caution: If two independent systems access the same volume of data and the operating system does not support file locking, data corruption may occur. To avoid this, create two or more volumes (or LUNs) and configure each volume to be accessed by one system only.


Dual-port Duplex RAID Configuration

Figure 3-6 illustrates a dual-port quad-path attached duplex RAID configuration. This configuration supports the following features:

  • Transfer speeds up to 400 MB/s

  • Failover capabilities

  • SGI FailSafe high-availability solutions

    Figure 3-6. Dual-port Quad-path Duplex RAID Topology



    Caution: If two independent systems access the same volume of data and the operating system does not support file locking, data corruption may occur. To avoid this, create two or more volumes (or LUNs) and configure each volume to be accessed by one system only.