Compute Node Boot Options

One of the unique advantages of Scyld ClusterWare is the fast and flexible boot procedure for compute nodes. The Scyld BeoBoot system is a combination of unified booting and a carefully designed light-weight compute node environment. The BeoBoot system allows compute nodes to initialize with a very small boot image that may be stored on a wide range of boot media. This small boot image never has to change; however, Scyld ClusterWare’s boot setup allows you to change the kernel the compute nodes run, the modules that are loaded, and every aspect of the application environment by changing a few files on the master node.

This chapter gives instructions for setting up different types of boot media for the compute nodes, changing various settings that control the boot process, and checking for boot error messages. A detailed description of the boot process is included in the ClusterWare technical description in Scyld ClusterWare Design Overview.

Compute Node Boot Media

There are several ways to boot a compute node with Scyld ClusterWare, as discussed in the following sections. The methods described are all interchangeable, and they work seamlessly with each other. Thus, you can have some of your compute nodes boot using one method and other nodes boot with a different method.

PXE

PXE is a protocol that defines a standard way to netboot x86-based machines. For PXE to work, both the network adapters and the BIOS of your compute nodes must support it, and the PXE boot option must be enabled in the BIOS. PXE is the preferred method of booting nodes in a Scyld cluster.

Local Disk

You can configure a node to boot from its local hard drive. See Managing Non-Scyld Nodes for details.

Linux BIOS

Linux BIOS is a project to replace the BIOS of a machine with Linux. This greatly speeds up the boot process, because much of the work done by a conventional BIOS exists to support systems such as DOS and is not needed by Linux.

Third parties have done work so that a Scyld ClusterWare initial image can replace the BIOS. The advantage is that a compute node then needs only a motherboard with RAM, a processor, a built-in network adapter, and a power supply.

Linux BIOS is not supported by Penguin Computing, Inc.; however, see http://www.linuxbios.org/ for more information.

Flash Disk

Although not Scyld-specific, flash disks are mentioned here because they can increase cluster reliability. A flash disk is a solid-state device using Electrically Erasable PROM (EEPROM). The BIOS sees these devices as IDE or SCSI hard drives, and they support all normal drive operations, including running beofdisk and installing the initial boot image. This allows a compute node configuration with no moving parts other than cooling fans, and is an alternative to using the Linux BIOS. These devices are faster and cheaper than hard drives, and are currently limited to capacities of 4 MB to 512 MB; for booting, however, less than 2 MB is needed.

Changing Boot Settings

Adding Steps to the node_up Script

If you wish to add steps to the node_up script, you can do so without editing the script itself. Instead, create a script in the /etc/beowulf/init.d/ directory. All scripts in this directory are executed for each node that boots. Each script is sourced by the node_up script when a node boots, so it must be written in standard sh. When your script is sourced, the variable $NODE is set to the number of the node that is booting. See Special Directories, Configuration Files, and Scripts for more details.
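As an illustration only, a script placed in /etc/beowulf/init.d/ might look like the following minimal sketch. The file name and the function name announce_node are hypothetical, and the message it prints is a placeholder for whatever step you actually need. The fallback value "unknown" is used only when $NODE is not set, for example when trying the script by hand outside of node_up.

```shell
# Hypothetical file: /etc/beowulf/init.d/99announce
# This file is sourced by node_up, so do not call 'exit' here:
# in a sourced script, 'exit' would terminate node_up itself.

announce_node() {
    # node_up sets $NODE to the number of the booting node.
    # The ':-unknown' fallback only matters when testing by hand.
    echo "custom boot step for node ${NODE:-unknown}"
}

announce_node
```

Because the script is sourced rather than run as a separate process, any variables or functions it defines remain visible to the rest of node_up, so distinctive names help avoid collisions.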

Per-Node Parameters

Starting with Scyld Series 30, the cluster config file /etc/beowulf/config supports specifying the kernel image, initrd image, and kernel command line parameters on a per-node basis. This enables one set of nodes to boot with a particular initrd image, while another group boots with a different one.

The utility of this feature can be illustrated with the memtest86 memory testing utility. For example, if you had just expanded your cluster with five new nodes (nodes 16 through 20) and wanted to test their memory before putting them into production, you could have them all boot into memtest86 rather than the usual Scyld initrd with the following entries in /etc/beowulf/config:

kernelimage 16-20 /var/beowulf/boot/memtest86.bin
initrdimage 16-20 none
kernelcommandline 16-20 none
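
To review which of these entries are currently present, a small shell helper can filter the config file. This is a sketch, not a ClusterWare tool: the function name show_node_overrides is hypothetical, and it simply prints every line that begins with one of the three keywords shown in the example.

```shell
# Print the kernelimage, initrdimage, and kernelcommandline lines
# from a ClusterWare config file.
# Usage: show_node_overrides [config-file]
show_node_overrides() {
    # Default to the cluster config file named in this chapter.
    config="${1:-/etc/beowulf/config}"
    # Keep only lines that start with one of the three keywords;
    # lines starting with anything else (such as '#') are skipped.
    grep -E '^[[:space:]]*(kernelimage|initrdimage|kernelcommandline)[[:space:]]' "$config"
}
```

Running show_node_overrides with no argument reads /etc/beowulf/config; passing a path lets you check a copy of the file before installing it.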

Other Per-Node Config Options

The cluster config file /etc/beowulf/config provides per-node support for node state changes, which allows the use of other scripts or tools to control and manipulate the wake, alert, and down states of nodes in the cluster.

Error Logs

There are a number of ways to check for errors that occur during the compute node boot process, as follows:

  • During the compute node boot process, any error messages are sent to the console of the compute node and forwarded by the node’s beoklogd daemon to the /var/log/messages file on the cluster’s syslog server. By default, the syslog server is the master node. See the syslog_server= option in Compute node command-line options for details about how to direct these compute node logging messages to an alternate server. You can view the messages by opening this file read-only in a text editor or by running the standard Linux System Logs tool: select System Tools -> System Logs from the desktop menu to open the System Logs window, select System Log from the list of logs in the left panel, then scroll near the end to see errors.

  • During each node’s boot, the node_up script writes node-specific output to a log file /var/log/beowulf/node.<nodenumber>, where <nodenumber> is the node number. If the compute node ends up in the error state, or if it remains in the boot state for an extended length of time, then you should examine this node log.
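
The per-node log described above can also be inspected from the command line. The following is a minimal sketch; the function name show_node_log and the choice of 20 lines are assumptions made for illustration, while the log path follows the /var/log/beowulf/node.<nodenumber> pattern given above.

```shell
# Show the last lines of the node_up log for one compute node.
# Usage: show_node_log NODE_NUMBER [LOG_DIR]
show_node_log() {
    node="$1"
    # Default to the log directory named in this chapter.
    log_dir="${2:-/var/log/beowulf}"
    log_file="${log_dir}/node.${node}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read ${log_file}" >&2
        return 1
    fi
    # The most recent boot output is at the end of the file.
    tail -n 20 "$log_file"
}
```

For example, show_node_log 16 prints the end of /var/log/beowulf/node.16, which is a reasonable first place to look when node 16 is in the error state or lingers in the boot state.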