Compute Node Boot Options¶
One of the unique advantages of Scyld ClusterWare is the fast and flexible boot procedure for compute nodes. The Scyld BeoBoot system is a combination of unified booting and a carefully designed lightweight compute node environment. The BeoBoot system allows compute nodes to initialize with a very small boot image that may be stored on a wide range of boot media. This small boot image never has to change; however, Scyld ClusterWare’s boot setup allows you to change the kernel the compute nodes run, the modules that are loaded, and every aspect of the application environment by changing a few files on the master node.
This chapter gives instructions for setting up different types of boot media for the compute nodes, changing various settings that control the boot process, and checking for boot error messages. A detailed description of the boot process is included in the ClusterWare technical description in Scyld ClusterWare Design Overview.
Compute Node Boot Media¶
There are several ways to boot a compute node with Scyld ClusterWare, as discussed in the following sections. The methods described are all interchangeable, and they work seamlessly with each other. Thus, you can have some of your compute nodes boot using one method and other nodes boot with a different method.
PXE is a protocol that defines a standard way to netboot x86-based machines. For PXE to work, your compute nodes must support it in both the network adapters and the BIOS, and the PXE boot option must be turned on in the BIOS. This is the preferred method of booting nodes in a Scyld cluster.
You can configure a node to boot from its local hard drive. See Managing Non-Scyld Nodes for details.
Linux BIOS is a project to replace the BIOS of a machine with Linux. This greatly speeds up the boot process, because most of the work done by a conventional BIOS exists to support systems such as DOS and is not needed by Linux.
Third parties have worked on using a Scyld ClusterWare initial image as the BIOS replacement. The advantage is that a compute node then needs only a motherboard with RAM, a processor, a built-in network adapter, and a power supply.
Linux BIOS is not supported by Penguin Computing, Inc.; see http://www.linuxbios.org/ for more information if you are interested.
Although not Scyld specific, flash disks are mentioned here because they can
increase cluster reliability. A flash disk is a solid-state device using
Electrically Erasable PROM (EEPROM). The BIOS sees these devices
as IDE or SCSI hard drives, and they support all normal drive operations,
including beofdisk and installing the initial boot image.
This allows a node configuration with no moving parts other than
cooling fans, and is an alternative to using the Linux BIOS. These
devices are faster and cheaper than hard drives, and are currently
limited to 4 MB to 512 MB. But, for booting, less than 2 MB would be
Changing Boot Settings¶
Adding Steps to the node_up Script¶
If you wish to add more steps to be executed during the node_up script, you can do so without editing the script itself. Instead, create a script in the /etc/beowulf/init.d/ directory. All scripts in this directory are executed for each node that boots. Each script is sourced by the node_up script when a node boots, so it must be written in standard sh. When your script is sourced, the variable $NODE is set to the number of the node that is booting. See Special Directories, Configuration Files, and Scripts for more details.
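As a minimal sketch of such an add-on script: the file name `95local_note`, the function name `local_note`, and the message text are all hypothetical and only illustrate how a sourced script can use $NODE. The default value assigned to NODE exists only so the sketch can be tried outside of node_up.

```shell
# Hypothetical file: /etc/beowulf/init.d/95local_note
# This file is sourced by node_up, so it must be plain sh. Avoid calling
# "exit" here: in a sourced script, "exit" would end the calling script too.

# node_up sets NODE to the number of the booting node. The fallback to 0
# is an assumption made only so this sketch runs standalone.
NODE="${NODE:-0}"

# Hypothetical helper: report that a site-specific step ran for a node.
local_note() {
    echo "node $1: site-specific boot step ran"
}

local_note "$NODE"
```

Because the script is sourced rather than executed, any variables or functions it defines share the shell of the calling script, so distinctive names help avoid collisions.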
Starting with Scyld Series 30, support is provided for specifying kernel image and kernel command line parameters on a per-node basis in the cluster config file /etc/beowulf/config. This enables one set of nodes to boot with a particular initrd image, while another group boots with a different one.
The utility of this feature can be illustrated by the use of the memtest86 memory testing utility. For example, if you had just expanded your cluster with 5 new nodes (nodes 16 through 20), and you wanted to test their memory before putting them into production, you could have them all boot into memtest86 rather than the usual Scyld initrd with the following entries in /etc/beowulf/config:
kernelimage 16-20 /var/beowulf/boot/memtest86.bin
initrdimage 16-20 none
kernelcommandline 16-20 none
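To review which per-node boot overrides are currently present, a small helper like the following may be useful. It is only a sketch: the function name `list_boot_overrides` is hypothetical, and it assumes that each directive sits on its own line beginning with one of the three directive names used in the memtest86 example.

```shell
# List per-node boot override lines in a ClusterWare-style config file.
# The default path is the config file named in the text; pass another
# path as the first argument to inspect a copy instead.
list_boot_overrides() {
    config_file="${1:-/etc/beowulf/config}"
    # Print only lines that begin with one of the three directive names.
    grep -E '^[[:space:]]*(kernelimage|initrdimage|kernelcommandline)[[:space:]]' "$config_file"
}
```

Running `list_boot_overrides` before and after editing the config file gives a quick check that test entries, such as the memtest86 lines, were removed once the nodes are ready for production.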
Other Per-Node Config Options¶
The cluster config file /etc/beowulf/config provides per-node support for node state changes, which allows the use of other scripts or tools to control and manipulate the wake, alert, and down states of nodes in the cluster.
There are a number of ways to check for errors that occur during the compute node boot process, as follows:
During the compute node boot process, any error messages are sent to the console of the compute node and forwarded by the node’s beoklogd daemon to the /var/log/messages file on the cluster’s syslog server. By default, the syslog server is the master node. See the syslog_server= option in Compute node command-line options for details about how to direct these compute node logging messages to an alternate server. You can view the messages by opening this file read-only or by running the standard Linux System Logs tool: select System Tools -> System Logs from the desktop menu to open the System Logs window, select the System Log from the list of logs in the left panel, then scroll near the end to see errors.
During each node’s boot, the node_up script writes node-specific output to a log file whose name includes <nodenumber>, the node number. If the compute node ends up in the error state, or if it remains in the boot state for an extended length of time, then you should examine this node log.
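As a command-line alternative to the System Logs tool, the syslog check described above can be sketched as follows. The function name `recent_boot_errors` is hypothetical, and matching on the word "error" is an assumption, since the exact wording of boot messages is not specified here. Reading /var/log/messages typically requires root privileges.

```shell
# Show the most recent lines in a syslog file that mention "error".
# $1: log file to search (defaults to the path named in the text)
# $2: maximum number of matching lines to show (defaults to 20)
recent_boot_errors() {
    log_file="${1:-/var/log/messages}"
    max_lines="${2:-20}"
    # Case-insensitive match; the keyword is an assumption, adjust as needed.
    grep -i 'error' "$log_file" | tail -n "$max_lines"
}
```

For example, `recent_boot_errors /var/log/messages 50` prints up to the last 50 matching lines. The same function can be pointed at a node's own log file by passing that file's path as the first argument.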