Managing Non-Scyld Nodes¶
A ClusterWare cluster typically consists of a Scyld master node and one
or more Scyld compute nodes, integrated and communicating across the
private cluster network interface. However, ClusterWare also supports
additional devices and nodes that may reside on that private cluster
network. This section describes how these Scyld and non-Scyld nodes are
configured using entries in the /etc/beowulf/config
file.
DHCP IP address assignment to devices¶
The private cluster network may have one or more attached devices that issue a DHCP request to obtain a dynamic IP address, rather than being configured with a static IP address. Typically, only the master node (or nodes - see Managing Multiple Master Nodes) owns a static IP address.
Caution
Care must be taken with static IP addresses to guarantee there are no address collisions.
Examples of such devices are managed switches and storage servers. The beoserv DHCP service for such devices is configured using the host directive, together with an associated hostrange directive. For example,
nodes 32
iprange 10.20.30.100 10.20.30.131 # IPaddr range of compute nodes
...
hostrange 10.20.30.4 10.20.30.9 # IPaddr range of devices for DHCP
hostrange 10.20.30.90 10.20.30.99 # IPaddr range of PDUs for DHCP
...
host 00:A0:D1:E9:87:CA 10.20.30.5 smartswitch
host 00:A0:D1:E3:FC:E2 10.20.30.90 pdu1
host 00:A0:D1:E3:FD:4A 10.20.30.91 pdu2
The host keyword affects both the beoserv DHCP server and how the ClusterWare NSS responds to hostname lookups. It associates a non-cluster entity, identified by its MAC address, with an IP address that is delivered to that client entity if and when it makes a DHCP request to the master node, together with one or more optional hostnames to be associated with that IP address.
If a hostname is provided, then normal NSS functionality is available. Using the above example:
[user1@cluster ~] $ getent hosts smartswitch
returns:
10.20.30.5 smartswitch
and
[user1@cluster ~] $ getent ethers 00:A0:D1:E9:87:CA
returns:
00:a0:d1:e9:87:ca smartswitch
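Lookups work the same way for the other host entries; for example, given the pdu1 entry above:
[user1@cluster ~] $ getent ethers pdu1
returns:
00:a0:d1:e3:fc:e2 pdu1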
Each host IP address must fall within a defined hostrange range of IP addresses. Moreover, the potentially multiple hostrange ranges must not overlap one another, must not overlap the compute node range defined by the iprange directive, and must not collide with the IP address(es) of the master node(s) on the private network.
Simple provisioning using PXE¶
A default node entry, such as:
node 00:A0:D1:E5:C4:6E 00:A0:D1:E5:C4:6F
or an explicitly numbered node entry, such as one for node15:
node 15 00:A0:D1:E5:C4:6E 00:A0:D1:E5:C4:6F
is assumed to be a Scyld node, and a PXE request from one of these MAC
addresses results in beoserv
provisioning the node with the kernel
image, initrd image, and kernel command-line arguments that are
specified in /etc/beowulf/config
file entries, e.g.:
kernelimage /boot/vmlinuz-2.6.18-164.2.1.el5.540g0000
initrdimage /var/beowulf/boot/computenode.initrd
kernelcommandline rw root=/dev/ram0 image=/var/beowulf/boot/computenode.rootfs
ClusterWare automatically maintains the config
file’s default
kernelimage to specify the same kernel that currently executes on the
master node. A Scyld node integrates into the BProc
unified process
space.
Enhanced syntax allows for custom booting of different kernel and initrd images. For example, specific nodes can boot a standalone RAM memory test in lieu of booting a full Linux kernel:
kernelimage 15 /var/beowulf/boot/memtest86+-4.00.bin
initrdimage 15 none
kernelcommandline 15 none
Thus when node15 makes a PXE request, it gets provisioned with the specified binary image that performs a memory test. In the above example, the initrdimage of none means that no initrd image is provisioned to the node because that particular memory test binary doesn’t need an initrd. Moreover, the node number specifier of 15 can be a range of node numbers, each of which would be provisioned with the same memory test.
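For example, to run the same memory test on a block of nodes (the hyphenated range form shown here, e.g. 15-18, is illustrative; consult the config file comments for the exact range syntax accepted):
kernelimage 15-18 /var/beowulf/boot/memtest86+-4.00.bin
initrdimage 15-18 none
kernelcommandline 15-18 none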
Simple provisioning using the class directive¶
An optional config
file class directive assigns a name to a set of
image and kernel command-line arguments. The previous example can be
alternatively accomplished with:
class memtest kernelimage /var/beowulf/boot/memtest86+-4.00.bin
class memtest initrdimage none
class memtest kernelcommandline none
...
node 15 memtest 00:A0:D1:E5:C4:6E 00:A0:D1:E5:C4:6F
which results in the same memory test provisioning of node15 as seen earlier.
Similarly, the default Scyld node provisioning can be expressed as:
class scyld kernelimage /boot/vmlinuz-2.6.18-164.2.1.el5.540g0000
class scyld initrdimage /var/beowulf/boot/computenode.initrd
class scyld kernelcommandline rw root=/dev/ram0 image=/var/beowulf/boot/computenode.rootfs
...
node scyld pxe pxe 00:A0:D1:E5:C4:6E 00:A0:D1:E5:C4:6F
The first pxe is termed the boot-sequence, and the second pxe is termed the boot-stage. The boot-stage describes how beoserv should respond to a node’s PXE request. In the example above, the boot-stage of pxe instructs beoserv to respond to the node’s first PXE request with the kernel image, initrd image, and kernel command-line specified in the class scyld.
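To make the field positions explicit, here is the same default node entry restated with an annotating comment line (the # comment style follows the earlier config examples):
#    class  boot-sequence  boot-stage  MAC address(es)
node scyld  pxe            pxe         00:A0:D1:E5:C4:6E 00:A0:D1:E5:C4:6F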
Booting a node from the local harddrive¶
The node entry’s boot-sequence and boot-stage have more powerful capabilities. For example, suppose node15 is installed with a full distribution of CentOS 4.8 on a local harddrive, and suppose the master node’s config file contains entries:
class genericboot kernelimage none
class genericboot initrdimage none
class genericboot kernelcommandline none
...
node 15 genericboot pxe+local local 00:A0:D1:E5:C4:6E 00:A0:D1:E5:C4:6F
When node15 boots, it first makes a DHCP request to join the private
cluster network, then it attempts to boot, abiding by the specific
sequence of boot devices named in its BIOS. ClusterWare expects that the
first boot device is PXE over Ethernet, and the second boot device is a
local harddrive. When node15 initiates its PXE request to the master node, beoserv sees the boot-stage of local and thus directs node15 to “boot next”, i.e., to boot from the local harddrive.
Provisioning a non-Scyld node¶
In the previous example, we assumed that node15 already had a functioning, bootable operating system installed. Having a preexisting installation is not a requirement. Suppose the config file contains entries:
class centos5u4 kernelimage /var/beowulf/boot/vmlinuz-centos5u4_amd64
class centos5u4 initrdimage /var/beowulf/boot/initrd-centos5u4_amd64.img
class centos5u4 kernelcommandline initrd=initrd-centos5u4_amd64.img
ks=nfs:10.1.1.1:/home/os/kickstarts/n5-ks.cfg ksdevice=eth0
...
node 15 centos5u4 pxe+local pxe 00:A0:D1:E5:C4:6E 00:A0:D1:E5:C4:6F
(where the kernelcommandline has been broken into two lines for readability, although in reality it must be a single line in the config file). This time node15’s PXE request arrives, and the boot-stage of pxe directs beoserv to respond with the class centos5u4 kernel image, initrd image, and kernel command-line arguments. The latter’s ks argument informs node15’s kernel to initiate a kickstart operation, which is a Red Hat functionality that provisions the requester with rpms and other configuration settings as specified in the /home/os/kickstarts/n5-ks.cfg kickstart configuration file found on the master node. It is the responsibility of the cluster administrator to create this kickstart file.
See Special Directories, Configuration Files, and Scripts for a sample configuration file.
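As a rough illustration only (not the sample referenced above), a minimal kickstart file for this scenario might resemble the following sketch; the NFS export path, timezone, and package selection are assumptions:
# n5-ks.cfg -- illustrative minimal kickstart sketch; the OS media
# export, timezone, and package list are assumptions
install
nfs --server=10.1.1.1 --dir=/home/os/centos5u4
lang en_US.UTF-8
keyboard us
timezone --utc America/New_York
rootpw changeme
network --device=eth0 --bootproto=dhcp
clearpart --all --initlabel
autopart
reboot

%packages
@base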
After this initial PXE response (i.e., the pxe step of the pxe+local boot-sequence), beoserv rewrites the node entry to change the boot-stage to the local second step of the pxe+local boot-sequence. For example,
node 15 centos5u4 pxe+local pxe 00:A0:D1:E5:C4:6E 00:A0:D1:E5:C4:6F
gets automatically changed to:
node 15 centos5u4 pxe+local local 00:A0:D1:E5:C4:6E 00:A0:D1:E5:C4:6F
What this accomplishes is that the first PXE request is met with a directive to boot a kernel on node15 that initiates the kickstart provisioning, and any subsequent PXE request from node15 (presumably from a now fully provisioned node) results in a beoserv directive to node15 to “boot next”, i.e., to boot from the local harddrive.
If the cluster administrator wishes to reprovision the node and start fresh, then simply change the boot-stage from local back to pxe, and execute systemctl reload clusterware to instruct beoserv to re-read the config file and notice the manual change.
If you want the node to reprovision with kickstart on every boot (an unlikely scenario, presented here for completeness), then configure this using:
node 15 centos5u4 pxe pxe 00:A0:D1:E5:C4:6E 00:A0:D1:E5:C4:6F
Integrating a non-Scyld node into the cluster¶
A non-Scyld node that locally boots a full distribution operating system environment may have an assigned IP address in the private cluster network iprange, but it is initially invisible to the master node’s monitoring tools and job manager. The bpstat tool only knows about Scyld nodes, and the more general beostatus is ignorant of the non-Scyld node’s presence in the cluster. The non-Scyld node is itself ignorant of the names and IP addresses of other nodes in the cluster, whether Scyld or non-Scyld, until and unless the cluster administrator adds each and every node to the non-Scyld node’s local /etc/hosts file.
This shortcoming can be remedied by installing two special ClusterWare packages onto the non-Scyld node: beostat-sendstats and beonss-kickbackclient. These packages contain the client-side pieces of beostat and beonss. They are available in the standard ClusterWare yum repository and are compatible with non-Scyld RHEL and CentOS distributions - and perhaps with other distributions. One way to judge compatibility is to determine what libraries the ClusterWare daemons need to find on the non-Scyld compute node. (The daemons are known to execute in recent RHEL and CentOS environments.) Examine the daemons that were installed on the master node when ClusterWare was installed:
ldd /usr/sbin/sendstats
ldd /usr/sbin/kickbackproxy
and then determine whether the libraries that these binaries employ are present on the target non-Scyld node. If those libraries exist, then the special ClusterWare packages can be downloaded and installed on the non-Scyld node.
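One way to script this check is a loop along the following lines (a sketch: it assumes the non-Scyld node is reachable via passwordless ssh as the hypothetical hostname n15, and that libraries reside at the same paths there):
# For each daemon, list the shared libraries it needs on the master,
# then test whether the same library paths exist on the target node.
for bin in /usr/sbin/sendstats /usr/sbin/kickbackproxy; do
    for lib in $(ldd "$bin" | awk '/=> \// {print $3}'); do
        ssh n15 test -e "$lib" || echo "missing on n15: $lib"
    done
done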
First, you should download the packages from the ClusterWare yum repo. A
useful downloader is the /usr/bin/yumdownloader
utility, which can
be installed from the CentOS extras yum repository if it is not
already installed on your master node:
[root@cluster ~] # yum install yum-utils
Then use the utility to download the special Penguin ClusterWare rpms:
[root@cluster ~] # yumdownloader --destdir=<localdir> beostat-sendstats beonss-kickbackclient
This retrieves the rpms and stores them in the directory <localdir>, e.g., /var/www/html or /etc/beowulf/nonscyld.
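As one manual approach, the downloaded rpms can then be installed on the non-Scyld node (assuming the rpms have been copied to the node’s /tmp):
[root@node15 ~] # rpm -ivh /tmp/beostat-sendstats-*.rpm /tmp/beonss-kickbackclient-*.rpm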
These special packages can be installed manually on the non-Scyld node, or as part of the kickstart procedure (see Provisioning a non-Scyld node). Each package includes an /etc/init.d/ script that must be edited by the cluster administrator. Examine /etc/init.d/beostat-sendstats and /etc/init.d/beonss-kickbackclient, which contain comments that instruct the administrator how to configure each script. Additionally, the non-Scyld node’s /etc/nsswitch.conf must be configured to invoke the kickback service for the databases that the administrator wishes to resolve through beonss and the master node. See the master node’s /etc/beowulf/nsswitch.conf for a guide to which databases are supported, e.g., hosts, passwd, shadow, and group. Finally, on the non-Scyld node, enable the scripts to start at node startup:
[root@cluster ~] # chkconfig beostat-sendstats on
[root@cluster ~] # chkconfig beonss-kickbackclient on
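For reference, after editing, the relevant /etc/nsswitch.conf entries on the non-Scyld node might resemble the following sketch (the service ordering shown is a local policy choice; the master node’s /etc/beowulf/nsswitch.conf is the authoritative guide):
hosts:   files kickback dns
passwd:  files kickback
shadow:  files kickback
group:   files kickback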