Important Files on Head Nodes¶
The ~/.scyldcw/ Folder¶
As described elsewhere in this document, ClusterWare administrator
tools read some configuration details from the user's
~/.scyldcw/settings.ini
file.
This section describes the other common contents of the ~/.scyldcw/
folder.
Although this is included in the Important Files on Head Nodes chapter,
please note that this folder exists in the home directory of any user who
executes the ClusterWare tools,
and that these tools are intended to be installed not just on
the head node, but wherever an administrator finds it convenient and
has appropriate HTTP or HTTPS access to the head node.
auth_tkt.cookie¶
Whenever a user authenticates to the REST API running on a head node,
an authentication cookie is generated and used for subsequent requests in
the same session.
Even though sessions typically end when the executed tool completes,
the command line tools cache the authentication
cookie in the ~/.scyldcw/auth_tkt.cookie
file to allow for faster
tool start times. A summary of the network requests is logged at the
DEBUG level:
[sysadmin@virthead ~]$ scyld-nodectl -vv ls
DEBUG: GETing /node/{uid} through /mux
INFO: No value provided for global option 'client.auth_tkt'.
DEBUG: Starting new HTTP connection (1): localhost:80
DEBUG: http://localhost:80 "GET /api/v1/whoami HTTP/1.1" 200 74
DEBUG: Starting new HTTP connection (1): localhost:80
DEBUG: http://localhost:80 "GET /api/v1/whoami HTTP/1.1" 200 109
INFO: Loaded authentication cookie from previous run.
DEBUG: http://localhost:80 "POST /api/v1/mux?log=GET-/node/UID HTTP/1.1" 200 7995
DEBUG: 0.0946: transaction prepared in 0.017, completed in 0.033
INFO: Expanded '*' into 2 nodes.
Nodes
n0
n1
DEBUG: Saved authentication cookie instead of logging out.
DEBUG: 0.0959: exiting, waited for 0.033 seconds
As can be seen in the above log, the authentication token from a previous run was loaded and used for the duration of command execution and then re-cached for later use.
logs/¶
The command line tools also log their arguments and some execution
progress in the ~/.scyldcw/logs/
folder. By default each tool
keeps logs of its previous five runs, though this number can be adjusted
by setting the logging.max_user_logs
value in the settings.ini
file.
Set this value to zero to discard all logs,
or to a negative number to preserve logs indefinitely.
Administrators may be asked to provide these logs
(usually via the scyld-sysinfo
tool) when requesting assistance
from Penguin Computing technical support.
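For example, an entry along the following lines in settings.ini would keep the ten most recent logs per tool; this is only a sketch, assuming the dotted option name maps to a [logging] section with a max_user_logs key, as with other settings.ini values:
[logging]
max_user_logs = 10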
workspace/¶
The ~/.scyldcw/workspace/
folder is used by the scyld-modimg
tool to store, unpack, and manipulate image contents.
Root file system images are large,
which means this local image cache can grow large.
Administrators are encouraged to delete unneeded entries in the cache using
the scyld-modimg --delete
command,
either with the -i
(or --image
) argument to name specific images,
or with --all
to delete all local images.
This does not delete the remote copies of images stored on the head nodes,
only the local cache.
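For example, either of the following forms could be used; the image name here is only a placeholder:
scyld-modimg --delete --image myimage
scyld-modimg --delete --all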
Within this folder, the manifest.json
file contains JSON
formatted information about the cached images, while the images
themselves are stored as individual packed files with names based upon their
UID. If the cached images are ever out of sync with the manifest,
i.e. a file is missing or an extra file is present, then the
scyld-modimg
tool will print a warning:
WARNING: Local cache contains inconsistencies. Use --clean-local
to delete temporary files, untracked files, and remove missing
files from the local manifest.
This warning can be automatically cleared by running the tool with the
--clean-local
option. This is not done automatically in case some
useful image or other data might be lost. Alternatively, if the
manifest.json
is somehow lost, a new file can be constructed for a
collection of images using the --register-all
option. See the
command documentation for more details.
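For example, a sketch of these two recovery commands (see the scyld-modimg documentation for their exact behavior):
scyld-modimg --clean-local
scyld-modimg --register-all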
The location of the workspace folder can be controlled on the
scyld-modimg
command line or by the modimg.workspace variable
in the settings.ini
file.
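For example, a settings.ini entry along these lines would move the cache to a larger filesystem; the path is only illustrative, and the [modimg] section with a workspace key assumes the usual dotted-name layout:
[modimg]
workspace = /data/cw-workspace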
parse_failures/¶
Several ClusterWare tools execute underlying Linux commands, such as
rpm or yum, and parse their output to check for details of success or
failure. During execution and parsing, the stdout and stderr of the
Linux commands are cached in the ~/.scyldcw/parse_failures/
folder. If parsing completes, these files are deleted regardless of
whether the command succeeded or failed; if a tool crashes or
parsing fails, the files are left behind. Though not generally
useful to an administrator during normal operation, these output files
could be useful for debugging problems and may be requested by Penguin
Computing technical support.
Much like files in the ~/.scyldcw/logs/
folder,
these parse failures can be periodically purged if no problems are
encountered, though be aware that useful debugging information may be
lost.
The /opt/scyld/clusterware/
Folder¶
The /opt/scyld/clusterware
folder exists only on a head node and
contains the core ClusterWare installation. Selected contents are
described below.
/opt/scyld/clusterware/bin/¶
Tools located in the bin/
folder are intended to be run as root
only on head nodes and are rarely executed directly. This is where the
managedb
tool is located, as well as the pam_authenticator
application described in the Installation & Administrator Guide and the
randomize_ini
script executed during initial installation.
/opt/scyld/clusterware/conf/¶
The conf/
folder contains the principal configuration files for
the ClusterWare REST API.
In that folder the httpd_*.conf
files are used in the
actual Apache configuration, while the INI files control the behavior
of the Python Pyramid-based service. Modifications to any of these
files require the administrator to restart the clusterware
service. Also note that modifications to these files only affect the
one head node and may need to be replicated to other head nodes in
multihead configurations. Because of this, future releases may move
selected variables from the base.ini
file into the
ClusterWare database to provide a cluster-wide effect.
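For example, after editing these files an administrator would typically restart the service on that head node; this assumes the systemd unit shares the clusterware service name used above:
sudo systemctl restart clusterware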
Many aspects of the REST service can be tweaked through changes to
variables in the base.ini
file, and these are discussed throughout this
documentation. To list all available variables, use the
managedb
tool:
sudo /opt/scyld/clusterware/bin/managedb --print-options
This command will list all options registered with the configuration system, and although many of these options are for internal use only, Penguin Computing technical support may suggest changes in individual cases. The specific variables available and their effects may change in future releases.
The variable names take a general form of
SUBSYSTEM.VARIABLE
or PLUGIN.VARIABLE
. For example, the
plugins
subsystem is controlled through these variables, and
a specific authentication plugin is selected by the plugins.auth
variable. Further, what application the appauth
plugin uses is
controlled by the appauth.app_path variable. For a description of
this specific plugin, please see Securing the Cluster.
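As an illustration only, the relevant base.ini entries might look like the following; the values shown are assumptions, and the actual plugin name and application path depend on the installation:
plugins.auth = appauth
appauth.app_path = /opt/scyld/clusterware/bin/pam_authenticator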
Other variables in the base.ini
file follow similar patterns.
Variables in the production.ini
file are used to control aspects
of the Python Pyramid framework, specifically logging. Variables in
this file are also for internal use and should not be modified except
by the suggestion of Penguin Computing technical support.
/opt/scyld/clusterware/env/, modules/, and src/¶
The env/
, modules/
, and src/
folders contain the Python
virtual environment, including the libraries required by the
scyld-*
and other tools.
/opt/scyld/clusterware/parse_failures/¶
Similar to each administrator's ~/.scyldcw/parse_failures/
folder,
files in this folder will accumulate any parsing failures found while
running underlying Linux commands and should generally be empty. If
files are accumulating here, it is safe to delete them, but the
ClusterWare developers should be informed and may request a sample of
the files to diagnose the underlying failure.
/opt/scyld/clusterware/storage/¶
The storage/
folder is the default location used by the
local_files
plugin to store kernels, initramfs files, and packed
root file systems. The actual location of this folder is controlled by
the local_files.path variable in the base.ini
configuration
file.
This folder can grow relatively large depending on the size and
quantity of root file systems in the cluster. Most organizations will
want to include the storage
folder in their backup planning along
with the database contents obtained through scyld-install --save
or the managedb save
command.
See Backup and Restore for additional discussion of backing up
the database contents.
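For example, the storage folder itself could be archived alongside the database dump produced by scyld-install --save or managedb save; this tar invocation is only a sketch and the destination path is a placeholder:
sudo tar czf /root/cw-storage-backup.tar.gz -C /opt/scyld/clusterware storage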
/opt/scyld/clusterware/workspace/¶
The REST service running on each head node requires a location to
hold temporary files.
This location is controlled by the head.workspace variable and defaults to
/opt/scyld/clusterware/workspace/
.
Like the storage/
directory, workspace/
can grow relatively large,
but unlike storage/
it does not need to be backed up.
Any files or directories found in this folder are
temporary and should be deleted when the service is shut down or
restarted. If files or folders accumulate in this folder, they are safe
to remove, although this must be done carefully or when the REST
service is stopped. If files do accumulate here, please notify Penguin
Computing developers so that we may diagnose the underlying issue.
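For example, a conservative cleanup would stop the service before removing leftover entries and then restart it; this sketch assumes the systemd unit is named clusterware, and the contents should be reviewed before deletion:
sudo systemctl stop clusterware
sudo rm -rf /opt/scyld/clusterware/workspace/*
sudo systemctl start clusterware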
Compute Node Initialization Scripts¶
All compute node images should include the clusterware-node
package. This package includes systemd services used for periodically
reporting node status back to the head node as well as initialization
scripts run as the node is booting.
At the end of the boot process described in Node Images and Boot Configurations,
the mount_rootfs
script
hands control of the machine over to the standard operating system
initialization scripts when it switches to the newly mounted root.
Shortly after networking is established on the booting node,
it contacts its parent head node and begins periodic pushes of
status information to that parent,
which stores the information in the ClusterWare
database. The first data push includes detected hardware information,
while subsequent pushes contain only the more ephemeral node status
information. With each status update the node also retrieves its
attribute list and stores this list as an INI file at
/opt/scyld/clusterware-node/etc/attributes.ini
. Code running on
the compute node can use the contents of this file to customize the
node configuration. A simple attributes.ini
file:
[Node]
UID = c1bf15749d724105bce9e07a3d79cb69
[Attributes]
_boot_config = DefaultBoot
The [Node]
section will include node-specific details, while the
[Attributes]
section contains the node attributes as determined
from the node's groups using the process described in Node Attributes.
The clusterware-node
package also contains a symlink at /etc/clusterware
pointing to
/opt/scyld/clusterware-node/etc/
.
Shortly after the first status push, a series of shell scripts are
executed on the node to perform ClusterWare-specific node
initialization. These scripts are linked in
/opt/scyld/clusterware-node/scripts-enabled
and located in
/opt/scyld/clusterware-node/scripts-available
.
All such scripts should include
/opt/scyld/clusterware-node/functions.sh
for common variables and
functions, and should use the attributes.ini
described previously to
determine what actions are necessary. Cluster administrators are
invited to enable and disable these scripts in their root file system
images as they see fit and to contribute improved or added scripts
back to the ClusterWare developers for the continuing improvement of
the product.
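As a sketch of the expected pattern, a script in scripts-available/ might source functions.sh and branch on a node attribute; the script name, attribute name, and log path below are hypothetical:
#!/bin/bash
# Hypothetical example: /opt/scyld/clusterware-node/scripts-available/99-example.sh
# Pull in the common ClusterWare variables and helper functions.
. /opt/scyld/clusterware-node/functions.sh

# Read a hypothetical attribute from the attributes file written at each status update.
ATTR_FILE=/opt/scyld/clusterware-node/etc/attributes.ini
EXAMPLE_VALUE=$(awk -F' *= *' '$1 == "example_attribute" {print $2}' "$ATTR_FILE")

# Only act when the attribute is present.
if [ -n "$EXAMPLE_VALUE" ]; then
    echo "example_attribute is set to ${EXAMPLE_VALUE}" >> /var/log/cw-example.log
fi
Such a script could then be enabled by creating a symlink to it in the scripts-enabled/ folder, matching the linking arrangement described above.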