Database Objects Fields and Attributes¶
Various ClusterWare database objects (e.g., nodes, boot configurations, image configurations, administrators, attributes) each carry with them detailed descriptors called fields. Each field consists of a name-value pair and is relevant for its database object type. Fields are predefined by ClusterWare. The cluster administrator uses the update action to change a field value.
For instance, a compute node object for each node has fields
mac with the node's MAC address, name with the node's alphanumeric name,
and power_uri with a value denoting how to communicate with that node via IPMI.
For example, the command scyld-nodectl -i n0 ls -l
displays all the defined fields' name-value pairs for node n0.
Compute node and Attribute Group object types have a special field called attributes, which is a collection of one or more attribute name-value pairs. Attribute names that begin with an underscore "_" are called reserved attributes or system attributes. The cluster administrator uses the set action to change an attribute value. See the following section Reserved Attributes for details.
Additional attributes can be added by a cluster administrator as desired,
each with a custom name and value defined by the administrator.
Any script on a compute node can access the local file
/etc/clusterware/attributes.ini
and find that node's attributes.
On the node there are helper functions in
/opt/scyld/clusterware-node/functions.sh
for reading attributes,
specifically the function attribute_value.
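For example, a minimal sketch of a node-local script that reads one attribute (the attribute name my_role is hypothetical, and the exact calling convention of attribute_value should be confirmed in functions.sh):

#!/bin/bash
# Load the ClusterWare node helper functions.
. /opt/scyld/clusterware-node/functions.sh
# Read an administrator-defined attribute (hypothetical name "my_role").
role=$(attribute_value my_role)
echo "This node's role is: ${role}"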
Reserved Attributes¶
Within the ClusterWare attribute system, administrators are encouraged to store whatever information they find useful for labeling and customizing nodes. For ease of use, attribute names should be valid JavaScript variable names, i.e., they may begin with any uppercase or lowercase letter, followed by letters, digits, or underscores. Names that start with an underscore are used by ClusterWare; administrators set these attributes to affect the behavior of the system. These will be referred to as system attributes throughout this discussion.
Attributes are stored internally as a JavaScript dictionary mapping strings to strings, otherwise known as name-value pairs. Administrator-defined attribute values should be strings and relatively small in size. The ClusterWare backend database enforces some document size constraints, and collections of node attributes should be no more than tens to hundreds of kilobytes in size. Individual attributes can be any length as long as the overall attribute group or node object size does not exceed these limits. Generally, if a cluster configuration is approaching these sizes, a cluster administrator should pursue moving data from the database into shared storage locations referenced by database entries.
Attributes can be applied directly to nodes, but may also be collected into groups, and then these groups applied to sets of nodes. Attributes passed to nodes through groups are treated no differently than those applied directly to a node. Attribute groups help cluster administrators create more scalable and manageable configurations. See Node Attributes for more details.
The remainder of this section lists the system attributes, describing their use and allowed values.
_ansible_pull¶
Default: none
Values: reference to an ansible git repo and a playbook in that repo
Depends: none
See Appendix: Using Ansible for details about format and usage.
_ansible_pull_args¶
Default: none
Values: optional arguments for _ansible_pull
Depends: using _ansible_pull
Specifies optional arguments for _ansible_pull. See Appendix: Using Ansible for details about format and usage.
_ansible_pull_now¶
Default: none
Values: reference to an ansible git repo and a playbook in that repo
Depends: none
The cluster administrator must enable and start the service by executing
systemctl enable cw-ansible-pull-now
and systemctl start cw-ansible-pull-now.
See Appendix: Using Ansible for details about format and usage.
_bmc_pass¶
Default: none
Values: password to the node BMC
Depends: none
This attribute is not directly used within ClusterWare but is meant to
be referenced in a power_uri such as
ipmi:///admin:<attributes[_bmc_pass]>@10.10.10.10
. This attribute is masked by
default so the password will not be printed to the terminal by
scyld-nodectl
without the --no-pretty
argument.
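For example, a hedged sketch that stores the BMC password as a masked attribute and references it from the node's power_uri field (the IP address and password are placeholders, and the exact update syntax should be confirmed for your release):

scyld-nodectl -i n0 set _bmc_pass=SuperSecret
scyld-nodectl -i n0 update power_uri='ipmi:///admin:<attributes[_bmc_pass]>@10.10.10.10'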
_bootloader¶
Default: none
Values: bootloader to install, currently only "grub"
Depends: _boot_style=disked
Setting this attribute while using a disked boot style will trigger code in the initramfs to install the requested bootloader to the disk containing the partition that contains the /boot directory, append necessary entries into /etc/fstab based on then-mounted partitions, set the _boot_style to sanboot, and reboot the system.
This option is commonly coupled with the _ignition attribute to provide partitioning and filesystem creation. Using these attributes together allows for deploying images as persistent installations for infrastructure nodes.
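For example, a hedged sketch of deploying a node as a persistent local installation (the node name and butane file name are hypothetical placeholders):

scyld-nodectl -i n10 set _ignition=infra-layout.butane
scyld-nodectl -i n10 set _boot_style=disked
scyld-nodectl -i n10 set _bootloader=grub
scyld-nodectl -i n10 reboot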
_bootnet¶
Default: bootnet
Values: name for the system boot interface
Depends: none
By default, a system will rename whatever interface is used for network booting to "bootnet" but that name can be changed by setting this attribute. Be careful not to use a name already in use on the target system.
_busy¶
Default: undefined (see below)
Values: boolean (case-insensitive 1/0, on/off, y/n, yes/no, t/f, true/false)
Depends: none
This attribute explicitly controls the behavior of the compute node's cw-status-updater service which periodically gathers node state information every _status_secs seconds (default 10) and reports that information to its parent head node.
The cw-status-updater service can function in one of two ways:
- The default manner, which gathers frequently changing state (e.g., uptime and load average) and occasionally gathers (albeit more expensively) infrequently changing state information (e.g., what hardware is present and which ClusterWare packages are currently installed), or
- A "busy mode" manner that severely reduces the scope of what information is gathered and reported. The service in "busy mode" is minimally invasive to the performance of real-time (especially multi-node) applications that are sensitive to interruptions and to "jitter".
If _busy is undefined,
then "busy mode" can be enabled or disabled by the presence or absence of
/opt/scyld/clusterware-node/etc/busy.flag
,
which can be created in a job scheduler prologue and deleted in an epilogue,
or can be manually created and deleted.
If neither _busy nor busy.flag
is employed,
then the compute node may heuristically determine on its own whether
or not to execute in "busy mode".
A compute node in "busy mode" reports that with
scyld-nodectl status -l
showing "busy: True".
_boot_config¶
Default: none
Values: boot configuration identifier
Depends: none
The _boot_config attribute defines what boot configuration a given node should use. For a detailed discussion of boot configurations and other database objects, please see Node Images and Boot Configurations.
A boot configuration identifier may be a (possibly truncated) UID or a boot configuration name.
_boot_rw_layer¶
Default: overlayfs
Values: overlayfs, rwtab
Depends: _boot_style == roram or iscsi
Use _boot_rw_layer to control the type of overlay used to provide
read/write access to an otherwise read-only root file system
image. The overlayfs provides a writable overlay across the entire
file system, while the rwtab approach only allows write access to the
locations defined in /etc/rwtab
or /etc/rwtab.d
in the node
image.
Note that prior to kernel version 4.9, overlayfs does not support
SELinux extended attributes and so cannot be used for compute nodes
with SELinux in enforcing mode. The rwtab option does work with
SELinux, but two additional changes need to be made when enabling
rwtab. First, the cluster administrator must modify the
/etc/sysconfig/readonly-root
file in the node image to ensure
READONLY is set to "yes":
READONLY=yes
Second, the kernel cmdline in the appropriate boot configuration must include "ro":
cmdline: enforcing=1 ro
_boot_style¶
Default: rwram
Values: rwram, roram, iscsi, disked, next, sanboot, live
Depends: none
Root file system images can be supplied to nodes through a variety of mechanisms, and this can be controlled on a per-node basis through the _boot_style attribute. In both the rwram and roram modes, the node will download the entire image into RAM and either unpack it into a tmpfs RAM file system (rwram) or apply a writable overlay (roram). These boot styles have the advantage of post-boot independence from the head node, meaning that the loss of a head node will not directly impact booted compute nodes.
The iscsi option uses less RAM as the boot image is not downloaded into node RAM, but depends on the head node even after the node is fully booted. Due to this dependence a head node crash may cause attached compute nodes to hang and lose work. This approach requires a writable overlay, as the images may be shared between multiple nodes.
With the disked option, the node boots with images read from local storage. See Appendix: Booting From Local Storage Cache for details.
Use the next option to exit the boot loader and allow the BIOS to try the next device in the BIOS boot order. Since this process depends on support in the BIOS, it may not work on every server model.
The sanboot option causes the booting node to boot using the iPXE
sanboot
command and defaults to booting the first hard
disk. Please see the _ipxe_sanboot attribute for more details.
The live option only works for ISO-based configurations, e.g., those used for kickstart. For supported ISOs (e.g., RHEL-based) the node boots into the live installer, and the administrator needs to interact with it via the (likely graphical) system console.
_boot_tmpfs_size¶
Default: half of RAM
Values: 1g, 2g, etc.
Depends: _boot_style == rwram or _boot_rw_layer == overlayfs
During the node boot process, a tmpfs is used to provide a writable area for diskless compute nodes. For the rwram boot style this attribute controls the size of the root file system where the image is unpacked. When booting with overlayfs on a roram or iscsi style, this attribute controls the size of the writable overlay.
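For example, to enlarge the writable area on nodes with ample RAM (the node range and size are placeholders):

scyld-nodectl -i n[0-15] set _boot_tmpfs_size=8g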
_coreos_ignition_url¶
Default: none
Values: The URL of a RHCOS *.ign ignition file.
Depends: none
Both _coreos_ignition_url and _coreos_install_dev are attributes that must be set to fill in variables in the associated boot config's cmdline. See Using RHCOS.
_coreos_install_dev¶
Default: none
Values: The device on the target node into which the image is installed.
Depends: none
Both _coreos_ignition_url and _coreos_install_dev are attributes that must be set to fill in variables in the associated boot config's cmdline. See Using RHCOS.
_disk_cache¶
Default: none
Values: local partition name + optional encryption
Depends: none
Specifies a persistent location where the node can store downloaded images. This location should be a local partition with sufficient size to hold a handful of compressed images.
If the specified location exists, then the node will retain there a copy of the downloaded image. During subsequent boots the node will first compare the checksum of a file previously saved with the expected checksum provided by the head node in order to avoid unnecessary downloads.
If the specified partition does not exist, then an error will be logged, although the node will download the image to RAM and still boot. If the partition exists but cannot be mounted, then it will be reformatted.
Optionally Linux Unified Key Setup (LUKS) encryption can
also be specified for the partition.
Append :encrypt
to the partition name to encrypt with a random key,
or append :encrypt=KEY
to specify an encryption key.
If no key is specified,
encryption is performed using standard LUKS tools with 128 bytes of data
from /dev/urandom
stored in a key file used as the passphrase.
This key file is only briefly stored in RAM and deleted shortly before an
Ext4 file system is created on the newly encrypted partition.
Alternatively, if the specified key is TPM
then the random key
will be stored in the booting system's Trusted Platform Module (TPM)
and deleted from RAM shortly before the file system is created. The
key can also be bound to specific TPM Platform Configuration Register
(PCR) values, meaning that the TPM will not later reveal the key unless
those PCRs hold the same values. Since these values include hashes of
the BIOS code, configuration, kernel, and other boot-time binaries,
access to the encrypted partition can be restricted to specific
boot-time configurations. If the TPM has an owner password set, it must
be provided in the _tpm_owner_pass attribute.
Note
The cryptsetup-luks package must be installed in the image being booted.
Specifying a KEY is essentially necessary for _disk_cache because without one the partition contents will be lost after a subsequent reboot, as they were encrypted with an unknown random key.
For example:
scyld-nodectl -i n[0-63] set _disk_cache=/dev/nvme0n1p2:encrypt=Penguin
If _disk_cache is present but no _disk_root is provided, and a roram-compatible image is downloaded, then the node will boot directly from the cached image with a writable overlay.
Important
Any data in the partition specified as a _disk_cache may be destroyed at boot time!
Similar to /etc/fstab
, partitions can be identified by device path,
UUID, PARTLABEL, or PARTUUID.
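For example, a hedged sketch that caches images on a local partition encrypted with a key sealed in each node's TPM (the partition name is a placeholder, and _tpm_owner_pass is only needed if the TPM has an owner password set):

scyld-nodectl -i n[0-63] set _disk_cache=/dev/nvme0n1p2:encrypt=TPM
scyld-nodectl -i n[0-63] set _tpm_owner_pass=OwnerPassword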
_disk_root¶
Default: none
Values: local partition name + optional encryption
Depends: ignored unless _boot_style == disked
Specifies a persistent location into which at boot time the node can unpack the root image. This will delete the contents of the partition before unpacking the root image. If the specified partition does not exist, then an error will be logged, although the node will still boot using the image unpacked into RAM.
Similar to _disk_cache,
append :encrypt
to the partition name to encrypt with a random key,
or :encrypt=KEY
to specify the encryption key.
For _disk_root a random key is preferable,
as the _disk_root contents are intended to be ephemeral across boots.
Important
All data in the partition specified as a _disk_root will be destroyed at boot time!
Similar to /etc/fstab
, partitions can be identified by device path,
UUID, PARTLABEL, or PARTUUID.
_disk_wipe¶
Default: none
Values: comma-separated list of local partition names + optional encryption
Depends: none
The listed partitions will be reformatted at every boot
with an Ext4 file system.
Similar to _disk_cache,
append :encrypt
to the partition name to enable "encryption at rest",
or :encrypt=KEY
to specify the encryption key.
Like _disk_root, a random key is preferable to ensure _disk_wipe partition
contents are not retrievable from a physically removed storage device.
_domain¶
Default: none
Values: a DNS domain name
Depends: none
Booting compute nodes will use this attribute when constructing their full names unless their _hostname attribute already includes their domain.
_gateways¶
Default: The default gateway for the node's interfaces
Values: <ifname>=IPaddress
Depends: None
Override the interface ifname's current gateway value with an alternative IP address. For example, _gateways=enpls0f0=10.20.30.40,enpls0f1=10.20.40.40.
_hardware_plugins¶
Default: None
Values: Comma-separated list of hardware plugin modules
Depends: None
Specifies a list of hardware plugins that are added to the list that might be built into the disk image. If a plugin is listed twice, the second listing will be silently ignored; if a plugin does not exist, it will be silently ignored; if a plugin returns an error or outputs no data, it will be silently ignored.
Hardware information is assumed to be changing less frequently and results may be cached to further reduce the load of the monitoring system.
_hardware_secs¶
Default: 300
Values: seconds between checking for status hardware changes
Depends: none
A node sends its hardware state (viewed with scyld-nodectl list --long
and list --long-long
) as a component of its larger basic status
information. See _status_secs
below. This hardware component is
typically only sent once at boot time. However, the node periodically
reevaluates its hardware state every _hardware_secs seconds,
and in the rare event that something has changed since it last
communicated its hardware state to its parent head node, then the
node includes the updated hardware information in its next periodic
basic status message.
Changes to this value are communicated to an up node without needing to reboot the node.
_health¶
Default: none
Values: node health status
Depends: none
Cluster administrators can use a health check tool that periodically
executes on a compute node (see _health_check) and relays the result
back to the head node as a value of _health.
The health check tool is expected to return a _health result string in one
of three forms:
an integer value of seconds-since-epoch (generated by date +%s
),
a no-problems-detected value of "healthy",
or some other string that provides more details about a problem or
problems encountered.
The scyld-nodectl
tool can display the literal _health value by executing:
scyld-nodectl -i n42 --fields attributes._health ls -l
Alternatively:
scyld-nodectl status --health [--refresh]
and the ClusterWare GUI display a simple summary of the literal _health value. The seconds-since-epoch value is displayed as "checking", the "healthy" value is displayed as "healthy", and any other value is displayed as "unhealthy".
The health check tool can set a custom _health value to provide more detailed information about the problem that was discovered, e.g.,
_health="Sent back to Penguin with RMA #123456"
or
_health="GPU2 is unhealthy"
_health_check¶
Default: /opt/scyld/clusterware-node/bin/check-health-basic.sh
Values: path to health check tool on node
Depends: none
Cluster administrators use the _health_check attribute to specify the path to
a script or binary executable that implements the health check for display in
the _health attribute (see _health).
The default /opt/scyld/clusterware-node/bin/check-health-basic.sh
tool
is duplicated on a head node as
/opt/scyld/clusterware-tools/examples/check-health-basic.sh
to provide a prototype for the cluster administrator to copy and modify as
desired, and then deploy to compute node(s),
or to install into an image file or files,
and then set _health_check for specified nodes to point to the path
of this alternative tool.
When a health check tool begins executing on the node, it should initially return a _health value of the current seconds since epoch, e.g.,
set-node-attribs _health=$(date +%s)
The ClusterWare GUI and scyld-nodectl status --health both interpret
this seconds-since-epoch value as "checking".
At completion of the health check, the "healthy", "unhealthy", or more
elaborate string result should be sent back to the head node
using the same set-node-attribs _health=<value>
mechanism.
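A minimal sketch of a custom health check tool follows (the set-node-attribs calls come from this documentation; the root-filesystem test and its threshold are purely illustrative assumptions):

#!/bin/bash
# Report "checking" by publishing the current seconds-since-epoch.
set-node-attribs _health=$(date +%s)

# Illustrative check: flag the node if the root filesystem is nearly full.
usage=$(df --output=pcent / | tail -1 | tr -dc '0-9')
if [ "${usage}" -ge 95 ]; then
    set-node-attribs _health="root filesystem ${usage}% full"
else
    set-node-attribs _health=healthy
fi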
_health_plugins¶
Default: None
Values: Comma-separated list of health plugin modules
Depends: None
Specifies a list of health plugins that are added to the list that might be built into the disk image. If a plugin is listed twice, the second listing will be silently ignored; if a plugin does not exist, it will be silently ignored; if a plugin returns an error or outputs no data, it will be silently ignored.
Health checks are assumed to be changing much less frequently and results may be cached to further reduce the load of the monitoring system.
_health_secs¶
Default: 300
Values: Number of seconds between health-check runs
Depends: None
Specifies the time between successive health update cycles. During each cycle, every health plugin will be run.
_health_check_secs¶
Default: 300
Values: seconds between executing the health check program
Depends: none
Specifies the interval in seconds for executing a health check program specified by the _health_check attribute. See _health_check.
_hostname¶
Default: none
Values: Hostname or fully-qualified domain name
Depends: none
Booting compute nodes will assign the value of _hostname as their
hostname using the hostnamectl
command. If the attribute value is a
simple name (without periods), then the cluster domain will be appended
to construct a FQDN. Changes to this variable take effect on the next
status update.
_hosts¶
Default: blank
Values: download
Depends: none
During the compute node boot process, a list of known hosts is
downloaded from the head node and is appended to the compute node's
/etc/hosts
.
By default this will only
append a list of head nodes to ensure that each compute node can
resolve all head nodes without DNS. If the _hosts attribute is set to
'download', then all compute node names and IP addresses will be
appended to /etc/hosts
.
_ignition¶
Default: none
Values: A filename in the kickstarts/ folder
Depends: none
Compute node disks can be partitioned early in the boot process using
the included ignition
tool. Setting the _ignition attribute
instructs the booting node to download the ignition
binary from
the head node and then use it to download the configuration named by
the _ignition attribute. At different points in the boot process,
ignition
will execute its fetch, kargs, disks, mount, and
files stages. The ignition
configuration file is not meant to be
human readable or editable so administrators are expected to write
YAML files in the format that the butane
tool can translate. If
the provided configuration file ends in .butane it will be converted
automatically by the ClusterWare backend at the time of download.
The ignition
tool provides several other capabilities, including the
ability to mount the created partitions. Using the _disk_root and
_ignition attributes together, an administrator can configure a node
using the disked boot style with directories such as /var, /usr,
etc. on different partitions as required by government STIGs.
Additional documentation about ignition
can be found at:
https://coreos.github.io/ignition/
_ips¶
Default: none
Values: comma-separated IP assignments
Depends: none
Compute nodes commonly define additional high-speed network interfaces other than the PXE boot network. These interfaces are commonly defined by ifcfg-XXX files located in /etc/sysconfig/network-scripts and differ between nodes only in the assigned IP address. Use the _ips attribute to specify what IP address should be assigned to an individual node on one or more interfaces. For example, a value of _ips=eno0=10.10.23.12,ib0=192.168.24.12 would cause the prenet/write_ifcfg.sh startup script to replace any IPADDR= line in /etc/sysconfig/network-scripts/ifcfg-ib0 with IPADDR=192.168.24.12 and would similarly modify the adjacent ifcfg-eno0 file, replacing any IP assignment in that file with IPADDR=10.10.23.12.
_ipxe_sanboot¶
Default: none
Values: local disk or partition
Depends: _boot_style == sanboot
Use this attribute to cause a node to boot using the iPXE sanboot
command. This is most commonly used to boot a locally installed disk,
although administrators are cautioned to be extremely careful with
stateful compute nodes as they will retain modifications from
previous boots, leading to an unexpectedly heterogeneous cluster.
Nodes with this attribute set will not download an image from the head
node and will instead boot based on the URL or other iPXE sanboot
arguments provided. Please see the iPXE documentation for the
details of what iPXE provides: http://ipxe.org/cmd/sanboot
In addition to the arguments and URLs supported by iPXE, ClusterWare
also accepts a shorter URL for booting local disks of the form
local://0xHH
where 'HH' is a hexadecimal value specifying a local hard
disk. The first disk is identified as 0x80, the second is 0x81, and so
on. The provided hexadecimal value is then used in a
sanboot --no-describe --drive 0xHH
call.
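For example, a hedged sketch that boots node n5 from its first local disk (the node name is a placeholder):

scyld-nodectl -i n5 set _boot_style=sanboot
scyld-nodectl -i n5 set _ipxe_sanboot=local://0x80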
_macs¶
Default: The default MAC address for each of the node's interfaces
Values: <ifname>=<MACaddress>
Depends: None
Override the interface ifname's current MAC address with an alternative value. For example, _macs=bond0=aa:bb:cc:dd:ee:ff. Generally only used for bonded interfaces. Ignored for the booting interface bootnet.
_no_boot¶
Default: false
Values: boolean equivalents (0 / 1, true / false, t / f, yes / no, y / n)
Depends: none
The _no_boot attribute controls whether information about a node is provided to the DHCP server. Any node with _no_boot set to true will not receive DHCP offers from any ClusterWare head node. This allows an administrator to temporarily remove a node from the cluster.
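For example (the node name is a placeholder):

# Temporarily remove n3 from the cluster:
scyld-nodectl -i n3 set _no_boot=1
# Later, allow n3 to boot again:
scyld-nodectl -i n3 set _no_boot=0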
_preferred_head¶
Default: none
Values: head node UID
Depends: none
In a multihead configuration any head node can provide boot files to any compute node in the system. In most cases this is a desirable feature because the failure of any given head node will not cause any specific set of compute nodes to fail to boot. In some cases the cluster administrator may want to specify a preference of which head node should handle a given compute node. By setting a compute node's _preferred_head attribute to a specific head node's UID, all head nodes will know to point that node toward the preferred head node. This is implemented during the boot process when the iPXE script is generated and passed to the compute node. This means that any head node can still supply DHCP, the iPXE binaries, and the iPXE boot script, but the subsequent kernel, initramfs, and root file system files will be provided by the preferred head node, and thereafter the node's boot status information will be sent to that _preferred_head.
_remote_pass¶
Default: none
Values: node account password for _remote_user attribute
Depends: none
Supports an alternative to the customary ClusterWare ssh-key functionality.
It is useful for supporting scyld-nodectl exec
to non-ClusterWare
compute nodes which do not have clusterware-node installed,
but which do accept user/password authentication.
To use, install the sshpass RPM on the head node.
Set the _remote_pass attribute to the password of the _remote_user attribute
user name (default root).
Subsequent executions of scyld-nodectl exec
to nodes that are set up with
this attribute will employ this user/password pair to authenticate access
to those target node(s).
Note
Use of sshpass is discouraged and is not a best practice. A clear text password is a significant security risk.
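For example, a hedged sketch for reaching such a node (the node name, account, and password are placeholders, and the yum command assumes an RHEL-style head node):

# On the head node, install sshpass and describe the remote account.
sudo yum install sshpass
scyld-nodectl -i n200 set _remote_user=admin
scyld-nodectl -i n200 set _remote_pass=ChangeMe
# Subsequent exec commands authenticate with that user/password pair.
scyld-nodectl -i n200 exec uptime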
_remote_user¶
Default: root
Values: node account name
Depends: none
The _remote_user attribute controls what account is used on the
compute node when executing the scyld-nodectl reboot/shutdown
commands. Please ensure the specified account can execute sudo
shutdown
without a password or soft power control will not
work. Similarly the scyld-nodectl exec
and scyld-nodectl ssh
commands will also use the specified remote user account and the
boot-time script that downloads head node keys will store those keys
in the _remote_user's authorized_keys file.
_sched_extra¶
Default: None
Values: short line of text with a bit more information on the state of the scheduler
Depends: sched_watcher
service must be running on the cluster
Gives one line of information on the current state of the node with respect to the scheduler. E.g. if a node is down, it may include whether the node could be pinged, or whether the scheduler-daemon on the node was found.
Note: _sched_extra
is a "technology preview" and may change
or be replaced in the future.
_sched_full¶
Default: None
Values: JSON table of information
Depends: sched_watcher
service must be running on the cluster
Specifies all the information known about the current state of the node with respect to the scheduler. E.g. it may report the number of CPUs or memory seen by the scheduler; or any additional resources (like GPUs) that were found on the system.
Note: _sched_full
is a "technology preview" and may change
or be replaced in the future.
_sched_state¶
Default: None
Values: one of unknown, down, idle, or allocated
Depends: sched_watcher
service must be running on the cluster
Specifies the current state of the node with respect to the scheduler, e.g. Slurm.
Note: _sched_state
is a "technology preview" and may change
or be replaced in the future.
_status_cpuset¶
Default: all available CPUs
Values: list of one or more CPU numbers
Depends: none
When a compute node boots, the status-updater
and related child processes
can execute by default on any of the node's CPUs, as chosen by the kernel's
scheduler.
The administrator may instead choose to restrict these processes
to a subset of all CPUs, or even to just a single CPU,
in order to minimize the impact that these processes may have on
time-critical application(s) executing on the other CPUs.
The _status_cpuset value is a list of CPUs to use.
For example, set _status_cpuset=0
restricts the processes to just CPU 0,
set _status_cpuset="0-1"
restricts to CPUs 0 and 1,
and set _status_cpuset="0-1,4"
restricts to CPUs 0, 1, and 4.
See man 7 cpuset
for details.
_status_hardware_secs¶
Default: 300
Values: seconds between checking for status hardware changes
Depends: none
A node sends its hardware state (viewed with scyld-nodectl list --long
and list --long-long
) as a component of its larger basic status information.
See _status_secs above.
This hardware component is typically only sent once at boot time.
However, the node periodically reevaluates its hardware state every
_status_hardware_secs seconds,
and in the rare event that something has changed since it last communicated
its hardware state to its parent head node, then the node includes the
updated hardware information in its next periodic basic status message.
Changes to this value are communicated to an up node without needing to reboot the node.
_status_packages_secs¶
Default: 0
Values: seconds between checking for installed packages changes
Depends: none
The time interval in seconds between the relatively expensive search for what
Scyld packages are installed.
This value times 10 is the time interval between the even more expensive
calculations of a sha256sum hash of the sorted list of names of all installed
packages, distilled into a single hexadecimal value.
These values are seen by executing scyld-nodectl -i<nodes> status -L
on the head node.
A non-zero value should be longer than the _status_secs value, described below.
If the value is zero, then these package searches and calculations are done
just at node boot time, and additionally when (and if) the administrator
executes /usr/bin/update-node-status --hardware
on a compute node.
Such run-time changes to a node's installed packages are relatively rare,
so the default value is zero to minimize the performance impact of these
operations.
Changes to this value are communicated to an up node without needing to reboot the node.
_status_plugins¶
Default: None
Values: Comma-separated list of status plugin modules
Depends: None
Specifies a list of status plugins that are added to the list that might be built into the disk image. If a plugin is listed twice, the second listing will be silently ignored; if a plugin does not exist, it will be silently ignored; if a plugin returns an error or outputs no data, it will be silently ignored.
_status_secs¶
Default: 10
Values: seconds between status updates
Depends: none
Booted compute nodes periodically send basic status information to their parent head node. This value controls how often these messages are sent. Although the messages are relatively small, clusters with more compute nodes per head node will want to set this to a longer period to reduce load on the head node.
Changes to this value are communicated to an up node without needing to reboot the node.
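For example, a large cluster might lengthen the reporting interval (the node range and interval are placeholders):

scyld-nodectl -i n[0-999] set _status_secs=60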
_telegraf_omit_pattern¶
Default: (_sched_full|_telegraf_omit_pattern)
Values: regex matching pattern (awk rules)
Depends: None
Specifies a pattern to match for the Telegraf cw-attribs
plugin. Any compute node attributes or fields that match the
pattern will be omitted from the Telegraf data-stream (all
remaining fields and attributes will be included).
_telegraf_plugins¶
Default: None
Values: Comma-separated list of Telegraf plugin modules
Depends: None
Specifies a list of Telegraf plugins that are added to the list that might be built into the disk image. If a plugin is listed twice, the second listing will be silently ignored; if a plugin does not exist, it will be silently ignored.
The Telegraf/Grafana system is used for whole system monitoring
and trending, and is not directly integrated with other
ClusterWare tools (scyld-nodectl
will not report on Telegraf
data).
Changing _telegraf_plugins will cause a restart of Telegraf on the node.
_tpm_owner_pass¶
Default: none
Values: Owner password for the compute node TPM
Depends: none
Certain TPM commands require authentication using the "owner" TPM password. This means that the clear-text password must be provided to systems using the TPM for disk encryption via this attribute.