This chapter describes features supported in previous releases that enhance the features of your base Linux distribution. For a description of new features, please read Chapter 1, “Release Features”.
The SGI ProPack 1.2 for Linux- 230 Edition provides the Linux kernel version 2.2.13. The ProPack software adds functionality to base Linux distributions that is specific to SGI hardware platforms.
Some of the most significant features that Linux provides are listed below:
Typical commands you would expect to see on a UNIX-like system
Typical configuration files you would expect to see on a UNIX-like system, along with an optional graphical front end
Development tools such as compilers, debuggers, and libraries
Internet applications such as web servers and browsers, news servers, network utilities, e-mail servers, and clients
Everything needed for network file sharing with a wide variety of clients
Desktop environments and graphical applications
Note: For information about the software components that enable the graphical features of the SG230 system, see the file README.FIRST on the software CD, which is also available as /usr/doc/sgi/VisualWorkstation/README.FIRST after the software is installed.
The SGI ProPack 1.2 for Linux- 230 Edition software also provides optimization that enhances performance on database and other workloads. SGI has added a number of features to the Linux kernel and certain packages to provide increased performance and manageability for database workloads (such as Oracle 8i).
The performance enhancements include a kernel-level implementation of POSIX 1003.1-1996 asynchronous I/O, a low-overhead interprocess synchronization mechanism, low overhead and high-volume raw disk I/O, a fast gettimeofday(3) library function, and support for large amounts of physical memory.
The manageability and supportability improvements include kernel spinlock metering (for performance bottleneck analysis), kernel profiling enhancements, kernel memory dump capability with analysis tools, and kernel gdb hooks. The SGI ProPack 1.2 for Linux- 230 Edition also includes version 0.6 of the kernel debugger kdb. The features of kdb releases are documented at the following URL:
http://oss.sgi.com/projects/kdb
The manageability of the release has been improved by integrating a number of publicly available kernel patches, such as the following:
Stephen Tweedie's Raw I/O patch, which forms the basis for the SGI raw disk I/O enhancements. This patch is described in “Raw I/O Path Changes”.
The Device File System (CONFIG_DEVFS_FS) patch from Richard Gooch. This patch provides a more consistent naming scheme for hardware and software devices. Sites that expect to connect a large number of devices may find DEVFS very useful in helping to manage them. DEVFS can also provide the traditional Linux names for devices, for backward compatibility, and is otherwise very compatible with the rest of the Linux system.
The sard utility and associated kernel metrics patch for disk traffic analysis. This patch provides additional disk I/O statistics, useful for tuning database layouts and queries.
The performance enhancements, enumerated in the preceding section, accelerate the performance of I/O intensive applications by streamlining the kernel code and data paths for disk I/O as well as providing larger shared memory segments and a low overhead interprocess synchronization mechanism.
Current file-system-based disk I/O performs fixed-size I/O operations (typically 1024 bytes) into kernel buffers and then copies the data from the kernel buffer to the user program's address space. While this allows the file system to cache frequently accessed data, the copy from the kernel buffers into the user address space also consumes excess system bus bandwidth. Both the small size of each I/O (2 sectors) and the copy operation greatly reduce I/O subsystem throughput for database operations, where transactions and full-table scans run faster when the operating system does not handle the data in between.
To help alleviate this problem, Stephen Tweedie of Red Hat developed a mechanism that allows disk I/O directly to a buffer in the application address space, historically known as raw (or unprocessed) I/O. This mechanism locks the required pages of memory to prevent them from being paged out or swapped during the I/O operation. An application that needs to perform this type of disk I/O opens the character special device /dev/raw and binds the disk device to a special raw device using an ioctl(2) system call.
This mechanism, however, is cumbersome to use and suffers from some deficiencies. The primary deficiency is its continued use of the file-system buffer-header data structures and associated device queueing routines. While using the buffer headers is straightforward, it means that I/O operations must still be fragmented into 1024-byte pieces, which increases kernel overhead significantly. The mechanism used to bind an existing block device to a new raw device is also counterintuitive to UNIX system administrators, who expect to find a relationship in the device namespace between a block device and its corresponding raw device.
To address these concerns, SGI has added capabilities to Stephen Tweedie's raw I/O patch that allow large I/O operations directly to the user address space and bypass the bulk of the kernel I/O queueing code for SCSI and Fibre Channel devices.
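To put the fragmentation cost described above in numbers, the following minimal sketch counts how many operations a transfer needs at the 1024-byte size cited in the text. The 1 MiB transfer size is a hypothetical example, and treating the large-I/O case as a single request is an assumption; the text does not state a maximum request size.

```python
# Fixed per-operation size used by the buffer-header path (from the text).
BUFFER_IO_BYTES = 1024


def operations_needed(transfer_bytes: int, io_bytes: int) -> int:
    """Return how many operations of io_bytes are needed to move transfer_bytes."""
    # Integer ceiling division: a partial final piece still costs one operation.
    return -(-transfer_bytes // io_bytes)


transfer = 1024 * 1024  # hypothetical 1 MiB read, for example part of a table scan

fragmented = operations_needed(transfer, BUFFER_IO_BYTES)
single = operations_needed(transfer, transfer)  # assumption: issued as one request

print(fragmented, single)  # 1024 1
```

Under these assumptions, the kernel's per-operation overhead is paid 1024 times on the buffer-header path and once on the large-I/O path.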
You can download a dd command that is capable of using the raw device features from the following FTP location:
ftp://oss.sgi.com/projects/rawio/download/dd.raw
This feature is off by default, but you can turn it on by setting the CONFIG_RAW kernel configuration parameter.
More information about raw I/O is available from the following URL:
While the UNIX System V IPC semaphore facility does provide exceptional capability, its performance leaves much to be desired. Many UNIX vendors have released a low-overhead inter-application synchronization primitive known as “post/wait.”
SGI has included in this release a kernel-level implementation of post/wait, along with the library containing the application APIs. The post/wait facility allows a process to “wait” for an event. This event can be either a timeout or a “post” from another process. A group of cooperating processes can use these “post” and “wait” facilities to synchronize among themselves.
In order to use post/wait, the kernel must be compiled with the CONFIG_PW configuration variable, and you may optionally set an additional configuration variable, CONFIG_PW_VMAX. These variables are described in the configuration help. For a user program to use the post/wait facilities, it must link against libdba.so.
For more information on post/wait, please refer to the postwait(3) man page.
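The wait-for-a-post-or-a-timeout behavior described above can be illustrated with a small sketch. This is not the libdba API (see postwait(3) for that). The class PostWait and its method names are hypothetical, and it uses threads within one Python process as a stand-in for cooperating processes.

```python
import threading


class PostWait:
    """Illustrative stand-in for post/wait semantics; not the libdba API."""

    def __init__(self):
        self._event = threading.Event()

    def wait(self, timeout_s):
        """Block until another thread posts or the timeout expires."""
        posted = self._event.wait(timeout_s)
        self._event.clear()  # re-arm for the next wait
        return "post" if posted else "timeout"

    def post(self):
        """Wake the waiter."""
        self._event.set()


pw = PostWait()

# Nobody posts, so the wait ends with a timeout.
print(pw.wait(0.01))  # timeout

# A second thread posts shortly after the wait begins.
poster = threading.Timer(0.01, pw.post)
poster.start()
print(pw.wait(5.0))
poster.join()
```

The second wait returns as soon as the timer thread calls post(), well before its own five-second timeout.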
To correctly order operations in a database, the database must timestamp data and log entries frequently. Traditional gettimeofday(3) library functions are implemented as system calls, thus causing an address space switch (from user to kernel mode) each time a timestamp is required.
With this release, SGI has included a device driver that allows a read-only page of kernel memory, containing only the time value, to be mapped by an application (or on its behalf by a library function). gettimeofday is then implemented as a library function that simply reads the time value from the exported page in memory, thus avoiding a system call to obtain the time value.
In order to use the Fast gettimeofday(3) Library Function, the kernel must be compiled with CONFIG_SYSDAT and the user program must link against libdba in order to override the libc version of gettimeofday(3).
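The idea of reading the time from a mapped page instead of making a system call can be sketched as follows. This is only an illustration. An anonymous memory mapping stands in for the page exported by the driver, the function kernel_tick plays the kernel's role, and the page layout (two 64-bit integers for seconds and microseconds) is an assumption, since the text does not describe the device name or the layout.

```python
import mmap
import struct
import time

# Hypothetical page layout: seconds and microseconds as two 64-bit integers.
TIME_FORMAT = "qq"
TIME_SIZE = struct.calcsize(TIME_FORMAT)

# Stand-in for the kernel-exported page (the real one would be mapped read-only).
page = mmap.mmap(-1, mmap.PAGESIZE)


def kernel_tick(now_ns: int) -> None:
    """Play the kernel's role: store the current time at the start of the page."""
    sec, rem_ns = divmod(now_ns, 1_000_000_000)
    page[:TIME_SIZE] = struct.pack(TIME_FORMAT, sec, rem_ns // 1000)


def fast_gettimeofday():
    """Return (seconds, microseconds) by reading the mapped page only."""
    return struct.unpack_from(TIME_FORMAT, page, 0)


kernel_tick(time.time_ns())
sec, usec = fast_gettimeofday()
print(sec, usec)
```

A production implementation would also have to ensure that a reader never sees a half-updated value while the kernel is writing. The sketch does not address that.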
The ability to overlap I/O and processing activities has always been important to high-performance applications. To allow this type of overlap in single-threaded applications, SGI has included a kernel-level implementation of POSIX asynchronous I/O and the associated API library.
In the SGI ProPack 1.2 for Linux- 230 Edition, asynchronous I/O works with raw devices as well as with file systems, including pipes and sockets.
This facility is turned on by setting the CONFIG_AIO kernel option. User code can get access to the facility by linking with libdba.so. Further information can be found in the /lib/libdba/README file.
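The overlap of I/O and processing described above follows a submit, compute, collect pattern, sketched below. Python's standard library has no binding for the POSIX asynchronous I/O calls, so this sketch substitutes a worker thread for the kernel-level facility. It illustrates the pattern only, not the libdba interface. The function name overlapped_read and the temporary file are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def overlapped_read(path, nbytes, compute):
    """Start a read, run compute() while it is in flight, then collect both."""
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=1) as pool:
            # Submit: analogous to issuing an asynchronous read request.
            pending = pool.submit(os.pread, fd, nbytes, 0)
            # Compute: useful work proceeds while the read is outstanding.
            work = compute()
            # Collect: analogous to waiting for the request to complete.
            data = pending.result()
    finally:
        os.close(fd)
    return data, work


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 4096)
try:
    data, total = overlapped_read(tmp.name, 4096, lambda: sum(range(1000)))
    print(len(data), total)  # 4096 499500
finally:
    os.unlink(tmp.name)
```

With a kernel-level implementation, the application gets the same overlap without creating a second thread, which is the point made in the text about single-threaded applications.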
The following NFS functionality has been added:
NFS version 3 client and server support.
Network Lock Manager (NLM) version 4 client and server support.
Kernel level NFS and NLM implementation.
NFS and NFSD are configured as modules by default, but they can be configured to compile as part of the kernel by setting the CONFIG_NFS_FS and CONFIG_NFSD configuration parameters. The CONFIG_NFS_V3 and CONFIG_NFSD_V3 parameters are set by default and can be turned off if the user wants to use NFS version 2 only. The CONFIG_NFSD parameter needs to be configured for LOCKD to work, so if CONFIG_LOCKD is set, CONFIG_NFSD should be set also.
The SGI ProPack 1.2 for Linux- 230 Edition includes a feature that allows developers to gather statistical information about the SMP kernel's use of spinlocks and mrlocks (multiple-reader single-writer spinlocks). This functionality is called spinlock metering, or lockmetering.
Spinlock metering is built into the kernel using the CONFIG_LOCKMETER configuration option (in the Kernel Hacking section of make xconfig). A kernel built with lockmetering will exhibit a small (roughly 1%) performance degradation relative to a kernel that is not configured for lockmetering. See the following URL for additional information:
The following changes, explained briefly below, have been made to the Linux crash utility (lcrash). General information about lcrash can be found in the cmd/lcrash/README file.
Linux kernel crash dump enhancements. SGI ProPack 1.2 for Linux- 230 Edition provides a configuration option to allow kernel crash dumps to be available. This option is configured to be on by default, and the default dump space is the first swap partition found when booting. If you are building a new kernel, you can specify Support kernel crash dump capabilities in the Kernel Hacking section of make xconfig.
The crash dump capabilities in the kernel allow the system to create a crash dump when a failure occurs due to a panic() call or an exception. For more details on the dump method, compression used, and so on, please read the LKCD FAQ at the following URL:
http://oss.sgi.com/projects/lkcd/faq.html
Information about LKCD is also available in the file cmd/lcrash/README.lkcd.
Boot up process changes. As the system boots up, the /sbin/vmdump script will be run out of /etc/rc.d/rc.sysinit. This script saves crash dumps and reads sysconfig variables to open the dump device and configure the system for crash dumps.
Crash dump configuration options. There are a number of configurable options to save system crash dumps. Please read /etc/sysconfig/vmdump for more details on the options available. The following list describes what the options allow you to do:
Determine if you want to implement crash dumps in the kernel
Choose whether to save crash dumps to disk or not
Change the location to which the crash dumps are saved
Specify any block dump device you want
Compress (or not compress) the crash dumps
Configure the system to reset (or not reset) after a failure
The lcrash utility now uses the new librl library for command line input.
The following list describes patches implemented and enhancements to configuration options, commands, and libraries:
librl library. This new library supplies command line editing and command history functionality. See the /cmd/lcrash/lib/librl/README file for information on how to use this library. The lcrash command uses this library.
Remote debugging over a serial line. There is a new configuration option, CONFIG_GDB, which is used to enable gdb debugging. To force a kernel that has been compiled with CONFIG_GDB to pause during the boot process and wait for a connection from gdb, the parameter gdb should be passed to the kernel. This can be done by typing gdb after the name of the kernel on the LILO command line. The patch defaults to use ttyS1 at a baud rate of 38400. These parameters can be changed by using gdbttyS=port number and gdbbaud=baud rate on the command line.
rlimits patch. In the Linux 2.2.13 kernel, faulty rlimit checking will not allow a process to have more than 2 GB total address space, stack size, or locked memory. This release has fixed the rlimit checking, so (subject to other accounting limitations), the kernel honors RLIM_INFINITY settings on these resources.
SMP PTE patch. In stock Linux, the page stealing code that is used under high memory load has a bug that might cause it to steal a page from a process without writing out the contents to swap if the page has been modified by the process. This bug is only present in a multiprocessor machine. SGI ProPack 1.2 for Linux- 230 Edition provides a fix for this bug.
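For the rlimits item in the list above, the following sketch shows one way to inspect the three resources it names and to see whether a limit is set to RLIM_INFINITY. It uses Python's standard resource module on a Linux system. The helper names describe and report_limits are hypothetical, and the values printed depend on the system where it runs.

```python
import resource

# The three resources named in the rlimits item.
LIMITS = {
    "address space": resource.RLIMIT_AS,
    "stack size": resource.RLIMIT_STACK,
    "locked memory": resource.RLIMIT_MEMLOCK,
}


def describe(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value} bytes"


def report_limits():
    """Return {resource name: (soft limit, hard limit)} for this process."""
    report = {}
    for name, res in LIMITS.items():
        soft, hard = resource.getrlimit(res)
        report[name] = (describe(soft), describe(hard))
    return report


for name, (soft, hard) in report_limits().items():
    print(f"{name}: soft={soft} hard={hard}")
```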