Important Files on Head Nodes¶
The ~/.scyldcw/ Folder¶
As described elsewhere in this document, ClusterWare administrator
tools read some configuration details from the user's
~/.scyldcw/settings.ini
file.
This section describes the other common contents of the ~/.scyldcw/
folder.
Although this is included in the Important Files on Head Nodes chapter,
please note that this folder exists in the home directory of any user who
executes the ClusterWare tools,
and that these tools are intended to be installed not just on
the head node, but wherever an administrator finds it convenient and
has appropriate HTTP or HTTPS access to the head node.
auth_tkt.cookie¶
Whenever a user authenticates to the REST API running on a head node,
an authentication cookie is generated and used for subsequent requests in
the same session.
Even though sessions typically end when the executed tool completes,
the command line tools cache the authentication
cookie in the ~/.scyldcw/auth_tkt.cookie
file to allow for faster
tool start times. A summary of the network requests is logged at the
DEBUG level:
[sysadmin@virthead ~]$ scyld-nodectl -vv ls
DEBUG: GETing /node/{uid} through /mux
INFO: No value provided for global option 'client.auth_tkt'.
DEBUG: Starting new HTTP connection (1): localhost:80
DEBUG: http://localhost:80 "GET /api/v1/whoami HTTP/1.1" 200 74
DEBUG: Starting new HTTP connection (1): localhost:80
DEBUG: http://localhost:80 "GET /api/v1/whoami HTTP/1.1" 200 109
INFO: Loaded authentication cookie from previous run.
DEBUG: http://localhost:80 "POST /api/v1/mux?log=GET-/node/UID HTTP/1.1" 200 7995
DEBUG: 0.0946: transaction prepared in 0.017, completed in 0.033
INFO: Expanded '*' into 2 nodes.
Nodes
n0
n1
DEBUG: Saved authentication cookie instead of logging out.
DEBUG: 0.0959: exiting, waited for 0.033 seconds
As can be seen in the above log, the authentication token from a previous run was loaded and used for the duration of command execution and then re-cached for later use.
logs/¶
The command line tools also log their arguments and some execution
progress in the ~/.scyldcw/logs/
folder. By default each tool
keeps logs of its previous five runs, though this number can be adjusted
by setting the logging.max_user_logs
value in the settings.ini
file.
Set this value to zero to discard all logs,
or to a negative number to preserve logs indefinitely.
Administrators may be asked to provide these logs
(usually via the scyld-sysinfo
tool) when requesting assistance
from Penguin Computing technical support.
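For example, an entry along the following lines in settings.ini would keep the ten most recent logs per tool; this is only a sketch, assuming the dotted option name maps to a [logging] section with a max_user_logs key, as with other settings.ini values:
[logging]
max_user_logs = 10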
workspace/¶
The ~/.scyldcw/workspace/
folder is used by the scyld-modimg
tool to store, unpack, and manipulate image contents.
Root file system images are large,
which means this local image cache can grow large.
Administrators are encouraged to delete unneeded entries in the cache using
the scyld-modimg --delete
command,
either with the -i
(or --image
) argument to name specific images,
or with --all
to delete all local images.
This does not delete the remote copies of images stored on the head nodes,
only the local cache.
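For example, either of the following forms could be used; the image name here is only a placeholder:
scyld-modimg --delete --image myimage
scyld-modimg --delete --all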
Within this folder, the manifest.json
file contains JSON
formatted information about the cached images, while the images
themselves are stored as individual packed files with names based upon their
UID. If the cached images are ever out of sync with the manifest,
i.e. a file is missing or an extra file is present, then the
scyld-modimg
tool will print a warning:
WARNING: Local cache contains inconsistencies. Use --clean-local
to delete temporary files, untracked files, and remove missing
files from the local manifest.
This warning can be automatically cleared by running the tool with the
--clean-local
option. This is not done automatically in case some
useful image or other data might be lost. Alternatively, if the
manifest.json
is somehow lost, a new file can be constructed for a
collection of images using the --register-all
option. See the
command documentation for more details.
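For example, a sketch of these two recovery commands (see the scyld-modimg documentation for their exact behavior):
scyld-modimg --clean-local
scyld-modimg --register-all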
The location of the workspace folder can be controlled on the
scyld-modimg
command line or by the modimg.workspace variable
in the settings.ini
file.
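For example, a settings.ini entry along these lines would move the cache to a larger filesystem; the path is only illustrative, and the [modimg] section with a workspace key assumes the usual dotted-name layout:
[modimg]
workspace = /data/cw-workspace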
parse_failures/¶
Several ClusterWare tools execute underlying Linux commands, such as
rpm or yum, and parse their output to check for details of success or
failure. During execution and parsing, the stdout and stderr of the
Linux commands are cached in the ~/.scyldcw/parse_failures/
folder. If parsing completes, these files are deleted regardless of
whether the command succeeded or failed; if a tool crashes or
parsing fails, the files are left behind. Though not generally
useful to an administrator during normal operation, these output files
could be useful for debugging problems and may be requested by Penguin
Computing technical support.
Much like files in the ~/.scyldcw/logs/
folder,
these parse failures can be periodically purged if no problems are
encountered, though be aware that useful debugging information may be
lost.
The /opt/scyld/clusterware/
Folder¶
The /opt/scyld/clusterware
folder exists only on a head node and
contains the core ClusterWare installation. Selected contents are
described below.
/opt/scyld/clusterware/bin/¶
Tools located in the bin/
folder are intended to be run as root
only on head nodes and are rarely executed directly. This is where the
managedb
tool is located, as well as the pam_authenticator
application described in the Installation & Administrator Guide and the
randomize_ini
script executed during initial installation.
/opt/scyld/clusterware/conf/¶
The conf/
folder contains the principal configuration files for
the ClusterWare REST API.
In that folder the httpd_*.conf
files are used in the
actual Apache configuration, while the INI files control the behavior
of the Python Pyramid-based service. Modifications to any of these
files require the administrator to restart the clusterware
service. Also note that modifications to these files only affect the
one head node and may need to be replicated to other head nodes in
multihead configurations. Because of this, future releases may move
selected variables from the base.ini
file into the
ClusterWare database to provide a cluster-wide effect.
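For example, after editing these files an administrator would typically restart the service on that head node; this assumes the systemd unit shares the clusterware service name used above:
sudo systemctl restart clusterware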
Many aspects of the REST service can be tweaked through changes to
variables in the base.ini
file, and these are discussed throughout this
documentation. To list all available variables, use the
managedb
tool:
sudo /opt/scyld/clusterware/bin/managedb --print-options
This command will list all options registered with the configuration system, and although many of these options are for internal use only, Penguin Computing technical support may suggest changes in individual cases. The specific variables available and their effects may change in future releases.
The variable names take a general form of
SUBSYSTEM.VARIABLE
or PLUGIN.VARIABLE
. For example, the
plugins
subsystem is controlled through these variables, and
a specific authentication plugin is selected by the plugins.auth
variable. Further, what application the appauth
plugin uses is
controlled by the appauth.app_path variable. For a description of
this specific plugin, please see Securing the Cluster.
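As an illustration only, the relevant base.ini entries might look like the following; the values shown are assumptions, and the actual plugin name and application path depend on the installation:
plugins.auth = appauth
appauth.app_path = /opt/scyld/clusterware/bin/pam_authenticator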
Other variables in the base.ini
file follow similar patterns.
Variables in the production.ini
file are used to control aspects
of the Python Pyramid framework, specifically logging. Variables in
this file are also for internal use and should not be modified except
by the suggestion of Penguin Computing technical support.
/opt/scyld/clusterware/env/, modules/, and src/¶
The env/
, modules/
, and src/
folders contain the Python
virtual environment, including the libraries required by the
scyld-*
and other tools.
/opt/scyld/clusterware/parse_failures/¶
Similar to each administrator's ~/.scyldcw/parse_failures/
folder,
files in this folder will accumulate any parsing failures found while
running underlying Linux commands and should generally be empty. If
files are accumulating here, it is safe to delete them, but the
ClusterWare developers should be informed and may request a sample of
the files to diagnose the underlying failure.
/opt/scyld/clusterware/storage/¶
The storage/
folder is the default location used by the
local_files
plugin to store kernels, initramfs files, and packed
root file systems. The actual location of this folder is controlled by
the local_files.path variable in the base.ini
configuration
file.
This folder can grow relatively large depending on the size and
quantity of root file systems in the cluster. Most organizations will
want to include the storage
folder in their backup planning along
with the database contents obtained through scyld-install --save
or the managedb save
command.
See Backup and Restore for additional discussion of backing up
the database contents.
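For example, the storage folder itself could be archived alongside the database dump produced by scyld-install --save or managedb save; this tar invocation is only a sketch and the destination path is a placeholder:
sudo tar czf /root/cw-storage-backup.tar.gz -C /opt/scyld/clusterware storage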
/opt/scyld/clusterware/workspace/¶
The REST service running on each head node requires a location to
hold temporary files.
This location is controlled by the head.workspace variable and defaults to
/opt/scyld/clusterware/workspace/
.
Like the storage/
directory, workspace/
can grow relatively large,
but unlike storage/
it does not need to be backed up.
Any files or directories found in this folder are
temporary and should be deleted when the service is shut down or
restarted. If files or folders accumulate in this folder, they are safe
to remove, although this must be done carefully or when the REST
service is stopped. If files do accumulate here, please notify Penguin
Computing developers so that we may diagnose the underlying issue.
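For example, a conservative cleanup would stop the service before removing leftover entries and then restart it; this sketch assumes the systemd unit is named clusterware, and the contents should be reviewed before deletion:
sudo systemctl stop clusterware
sudo rm -rf /opt/scyld/clusterware/workspace/*
sudo systemctl start clusterware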
Compute Node Initialization Scripts¶
All compute node images should include the clusterware-node
package. This package includes systemd services used for periodically
reporting node status back to the head node as well as initialization
scripts run as the node is booting.
At the end of the boot process described in Node Images and Boot Configurations,
the mount_rootfs
script
hands control of the machine over to the standard operating system
initialization scripts when it switches to the newly mounted root.
Shortly after networking is established on the booting node,
it contacts its parent head node and begins periodic pushes of
status information to that parent,
which stores the information in the ClusterWare
database. The first data push includes detected hardware information,
while subsequent pushes contain only the more ephemeral node status
information. With each status update the node also retrieves its
attribute list and stores this list as an INI file at
/opt/scyld/clusterware-node/etc/attributes.ini
. Code running on
the compute node can use the contents of this file to customize the
node configuration. A simple attributes.ini
file:
[Node]
UID = c1bf15749d724105bce9e07a3d79cb69
[Attributes]
_boot_config = DefaultBoot
The [Node]
section will include node-specific details, while the
[Attributes]
section contains the node attributes as determined
from the node's groups using the process described in Node Attributes.
The clusterware-node
package also contains a symlink at /etc/clusterware
pointing to
/opt/scyld/clusterware-node/etc/
.
Shortly after the first status push, a series of shell scripts are
executed on the node to perform ClusterWare-specific node
initialization. These scripts are linked in
/opt/scyld/clusterware-node/scripts-enabled
and located in
/opt/scyld/clusterware-node/scripts-available
.
All such scripts should include
/opt/scyld/clusterware-node/functions.sh
for common variables and
functions, and should use the attributes.ini
described previously to
determine what actions are necessary. Cluster administrators are
invited to enable and disable these scripts in their root file system
images as they see fit and to contribute improved or added scripts
back to the ClusterWare developers for the continuing improvement of
the product.
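As a sketch of the expected pattern, a script in scripts-available/ might source functions.sh and branch on a node attribute; the script name, attribute name, and log path below are hypothetical:
#!/bin/bash
# Hypothetical example: /opt/scyld/clusterware-node/scripts-available/99-example.sh
# Pull in the common ClusterWare variables and helper functions.
. /opt/scyld/clusterware-node/functions.sh

# Read a hypothetical attribute from the attributes file written at each status update.
ATTR_FILE=/opt/scyld/clusterware-node/etc/attributes.ini
EXAMPLE_VALUE=$(awk -F' *= *' '$1 == "example_attribute" {print $2}' "$ATTR_FILE")

# Only act when the attribute is present.
if [ -n "$EXAMPLE_VALUE" ]; then
    echo "example_attribute is set to ${EXAMPLE_VALUE}" >> /var/log/cw-example.log
fi
Such a script could then be enabled by creating a symlink to it in the scripts-enabled/ folder, matching the linking arrangement described above.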