SLURM Release Information¶
The following is reproduced essentially verbatim from files contained within the SLURM tarball downloaded from https://slurm.schedmd.com.
SLURM was produced at Lawrence Livermore National Laboratory in collaboration
with various organizations.
Copyright (C) 2012-2013 Los Alamos National Security, LLC.
Copyright (C) 2011 Trinity Centre for High Performance Computing
Copyright (C) 2010-2015 SchedMD LLC
Copyright (C) 2009-2013 CEA/DAM/DIF
Copyright (C) 2009-2011 Centro Svizzero di Calcolo Scientifico (CSCS)
Copyright (C) 2008-2011 Lawrence Livermore National Security
Copyright (C) 2008 Vijay Ramasubramanian
Copyright (C) 2007-2008 Red Hat, Inc.
Copyright (C) 2007-2013 National University of Defense Technology, China
Copyright (C) 2007-2015 Bull
Copyright (C) 2005-2008 Hewlett-Packard Development Company, L.P.
Copyright (C) 2004-2009, Marcus Holland-Moritz
Copyright (C) 2002-2007 The Regents of the University of California
Copyright (C) 2002-2003 Linux NetworX
Copyright (C) 2002 University of Chicago
Copyright (C) 2001, Paul Marquess
Copyright (C) 2000 Markus Friedl
Copyright (C) 1999, Kenneth Albanowski
Copyright (C) 1998 Todd C. Miller <Todd.Miller@courtesan.com>
Copyright (C) 1996-2003 Maximum Entropy Data Consultants Ltd,
Copyright (C) 1995 Tatu Ylonen <ylo@cs.hut.fi>, Espoo, Finland
Copyright (C) 1989-1994, 1996-1999, 2001 Free Software Foundation, Inc.
Many other organizations contributed code and/or documentation without
including a copyright notice.
Written by:
Amjad Majid Ali (Colorado State University)
Par Andersson (National Supercomputer Centre, Sweden)
Don Albert (Bull)
Ernest Artiaga (Barcelona Supercomputer Center, Spain)
Danny Auble (LLNL, SchedMD LLC)
Susanne Balle (HP)
Anton Blanchard (Samba)
Janne Blomqvist (Aalto University, Finland)
David Bremer (LLNL)
Jon Bringhurst (LANL)
Bill Brophy (Bull)
Hongjia Cao (National University of Defense Techonogy, China)
Daniel Christians (HP)
Gilles Civario (Bull)
Chuck Clouston (Bull)
Joseph Donaghy (LLNL)
Chris Dunlap (LLNL)
Joey Ekstrom (LLNL/Bringham Young University)
Josh England (TGS Management Corporation)
Kent Engstrom (National Supercomputer Centre, Sweden)
Jim Garlick (LLNL)
Didier Gazen (Laboratoire d'Aerologie, France)
Raphael Geissert (Debian)
Yiannis Georgiou (Bull)
Andriy Grytsenko (Massive Solutions Limited, Ukraine)
Mark Grondona (LLNL)
Takao Hatazaki (HP, Japan)
Matthieu Hautreux (CEA, France)
Chris Holmes (HP)
David Hoppner
Nathan Huff (North Dakota State University)
David Jackson (Adaptive Computing)
Morris Jette (LLNL, SchedMD LLC)
Klaus Joas (University Karlsruhe, Germany)
Greg Johnson (LANL)
Jason King (LLNL)
Aaron Knister (Environmental Protection Agency)
Nancy Kritkausky (Bull)
Roman Kurakin (Institute of Natural Science and Ecology, Russia)
Eric Lin (Bull)
Don Lipari (LLNL)
Puenlap Lee (Bull)
Dennis Leepow
Bernard Li (Genome Sciences Centre, Canada)
Donald Lipari (LLNL)
Steven McDougall (SiCortex)
Donna Mecozzi (LLNL)
Bjorn-Helge Mevik (University of Oslo, Norway)
Chris Morrone (LLNL)
Pere Munt (Barcelona Supercomputer Center, Spain)
Michal Novotny (Masaryk University, Czech Republic)
Bryan O'Sullivan (Pathscale)
Gennaro Oliva (Institute of High Performance Computing and Networking, Italy)
Alejandro Lucero Palau (Barcelona Supercomputer Center, Spain)
Daniel Palermo (HP)
Dan Phung (LLNL/Columbia University)
Ashley Pittman (Quadrics, UK)
Vijay Ramasubramanian (University of Maryland)
Krishnakumar Ravi[KK] (HP)
Petter Reinholdtsen (University of Oslo, Norway)
Gerrit Renker (Swiss National Computer Centre)
Andy Riebs (HP)
Asier Roa (Barcelona Supercomputer Center, Spain)
Miguel Ros (Barcelona Supercomputer Center, Spain)
Beat Rubischon (DALCO AG, Switzerland)
Dan Rusak (Bull)
Eygene Ryabinkin (Kurchatov Institute, Russia)
Federico Sacerdoti (D.E. Shaw)
Rod Schultz (Bull)
Tyler Strickland (University of Florida)
Jeff Squyres (LAM MPI)
Prashanth Tamraparni (HP, India)
Jimmy Tang (Trinity College, Ireland)
Kevin Tew (LLNL/Bringham Young University)
Adam Todorski (Rensselaer Polytechnic Institute)
Nathan Weeks (Iowa State University)
Tim Wickberg (Rensselaer Polytechnic Institute)
Ramiro Brito Willmersdorf (Universidade Federal de Pemambuco, Brazil)
Jay Windley (Linux NetworX)
Anne-Marie Wunderlin (Bull)
CODE-OCEC-09-009. All rights reserved.
This file is part of SLURM, a resource management program.
SLURM is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.
SLURM is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
details.
You should have received a copy of the GNU General Public License along
with SLURM; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
OUR NOTICE AND TERMS OF AND CONDITIONS OF THE GNU GENERAL PUBLIC LICENSE
Our Preamble Notice
Auspices
This work performed under the auspices of the U.S. Department of Energy by
Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Disclaimer
This work was sponsored by an agency of the United States government.
Neither the United States Government nor Lawrence Livermore National
Security, LLC, nor any of their employees, makes any warranty, express
or implied, or assumes any liability or responsibility for the accuracy,
completeness, or usefulness of any information, apparatus, product, or
process disclosed, or represents that its use would not infringe privately
owned rights. References herein to any specific commercial products, process,
or services by trade names, trademark, manufacturer or otherwise does not
necessarily constitute or imply its endorsement, recommendation, or
favoring by the United States Government or the Lawrence Livermore National
Security, LLC. The views and opinions of authors expressed herein do not
necessarily state or reflect those of the United States government or
Lawrence Livermore National Security, LLC, and shall not be used for
advertising or product endorsement purposes.
The precise terms and conditions for copying, distribution and modification
is provided in the file named "COPYING" in this directory.
This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and administrators.
* Changes in Slurm 18.08.6-2
============================
-- Remove deadlock situation when logging and --enable-debug is used.
-- Fix RPM packaging for accounting_storage/mysql.
* Changes in Slurm 18.08.6
==========================
-- Added parsing of -H flag with scancel.
-- Fix slurmsmwd build on 32-bit systems.
-- acct_gather_filesystem/lustre - add support for Lustre 2.12 client.
-- Fix per-partition TRES factors/priority
-- Fix per-partition NICE priority
-- Fix partition access check validation for multi-partition job submissions.
-- Prevent segfault on empty response in 'scontrol show dwstat'.
-- node_features/knl_cray plugin - Preserve node's active features if it has
already booted when slurmctld daemon is reconfigured.
-- Detect missing burst buffer script and reject job.
-- GRES: Properly reset the topo_gres_cnt_alloc counter on slurmctld restart
to prevent underflow.
-- Avoid errors from packing accounting_storage_mysql.so when RPM is built
with out mysql support.
-- Remove deprecated -t option from slurmctld --help.
-- acct_gather_filesystem/lustre - fix stats gathering.
-- Enforce documented default usage start and end times when querying jobs from
the database.
-- Fix issues when querying running jobs from the database.
-- Deny sacct request where start time is later than the end time requested.
-- Fix sacct verbose about time and states queried.
-- burst_buffer/cray - allow 'scancel --hurry <jobid>' to tear down a burst
buffer that is currently staging data out.
-- X11 forwarding - allow setup if the DISPLAY environment variable lacks
a screen number. (Permit both "localhost:10.0" and "localhost:10".)
-- docs - change HTML title to include the page title or man page name.
-- X11 forwarding - fix an unnecessary error message when using the
local_xauthority X11Parameters option.
-- Add use_raw_hostname to X11Parameters.
-- Fix smail so it passes job arrays to seff correctly.
-- Don't check InactiveLimit for salloc --no-shell jobs.
-- Add SALLOC_GRES and SBATCH_GRES as input to salloc/sbatch.
-- Remove drain state when node doesn't reboot by ResumeTimeout.
-- Fix considering "resuming" nodes in scheduling.
-- Do not kill suspended jobs due to exceeding time limit.
-- Add NoAddrCache CommunicationParameter.
-- Don't ping powering up cloud nodes.
-- Add cloud_dns SlurmctldParameter.
-- Consider --sbindir configure option as the default path to find slurmstepd.
-- Fix node state printing of DRAINED$
-- Fix spamming dbd of down/drained nodes in maintenance reservation.
-- Avoid buffer overflow in time_str2secs.
-- Calculate suspended time for suspended steps.
-- Add null check for step_ptr->step_node_bitmap in _pick_step_nodes.
-- Fix multi-cluster srun issue after 'scontrol reconfigure' was called.
-- Fix accessing response_cluster_rec outside of write locks.
-- Fix Lua user messages not showing up on rejected submissions.
-- Fix printing multi-line error messages on rejected submissions.
* Changes in Slurm 18.08.5-2
============================
-- Fix Perl build for 32-bit systems.
* Changes in Slurm 18.08.5
==========================
-- Backfill - If a job has a time_limit guess the end time of a job better
if OverTimeLimit is Unlimited.
-- Fix "sacctmgr show events event=cluster"
-- Fix sacctmgr show runawayjobs from sibling cluster
-- Avoid bit offset of -1 in call to bit_nclear().
-- Insure that "hbm" is a configured GresType on knl systems.
-- Fix NodeFeaturesPlugins=node_features/knl_generic to allow other gres
other than knl.
-- cons_res: Prevent overflow on multiply.
-- Better debug for bad values in gres.conf.
-- Fix double accounting of energy at end of job.
-- Read gres.conf for cloud nodes on slurmctld.
-- Don't assume the first node of a job is the batch host when purging jobs
from a node.
-- Better debugging when a job doesn't have a job_resrcs ptr.
-- Store ave watts in energy plugins.
-- Add XCC plugin for reading Lenovo Power.
-- Fix minor memory leak when scheduling rebootable nodes.
-- Fix debug2 prefix for sched log.
-- Fix printing correct SLURM_JOB_ACCOUNT_PACK_GROUP_* in env for a Het Job.
-- sbatch - search current working directory first for job script.
-- Make it so held jobs reset the AccrueTime and do not count against any
AccrueTime limits.
-- Add SchedulerParameters option of bf_hetjob_prio=[min|avg|max] to alter the
job sorting algorithm for scheduling heterogeneous jobs.
-- Fix initialization of assoc_mgr_locks and slurmctld_locks lock structures.
-- Fix segfault with job arrays using X11 forwarding.
-- Revert regression caused by e0ee1c7054 which caused negative values and
values starting with a decimal to be invalid for PriorityWeightTRES and
TRESBillingWeight.
-- Fix possibility to update a job's reservation to none.
-- Suppress connection errors to primary slurmdbd when backup dbd is active.
-- Suppress connection errors to primary db when backup db kicks in
-- Add missing fields for sacct --completion when using jobcomp/filetxt.
-- Fix incorrect values set for UserCPU, SystemCPU, and TotalCPU sacct fields
when JobAcctGatherType=jobacct_gather/cgroup.
-- Fixed srun from double printing invalid option msg twice.
-- Remove unused -b flag from getopt call in sbatch.
-- Disable reporting of node TRES in sreport.
-- Re-enabling features combined by OR within parenthesis for non-knl setups.
-- Prevent sending duplicate requests to reboot a node before ResumeTimeout.
-- Down nodes that don't reboot by ResumeTimeout.
-- Update seff to reflect API change from rss_max to tres_usage_in_max.
-- Add missing TRES constants from perl API.
-- Fix issue where sacct would return incorrect array tasks when querying
specific tasks.
-- Add missing variables to slurmdb_stats_t in the perlapi.
-- Fix nodes not getting reboot RPC when job requires reboot of nodes.
-- Fix failing update the partition list of a job.
-- Use slurm.conf gres ids instead of gres.conf names to get a gres type name.
-- Add mitigation for a potential heap overflow on 32-bit systems in xmalloc.
CVE-2019-6438.
* Changes in Slurm 18.08.4
==========================
-- burst_buffer/cray - avoid launching a job that would be immediately
cancelled due to a DataWarp failure.
-- Fix message sent to user to display preempted instead of time limit when
a job is preempted.
-- Fix memory leak when a failure happens processing a nodes gres config.
-- Improve error message when failures happen processing a nodes gres config.
-- When building rpms ignore redundant standard rpaths and insecure relative
rpaths, for RHEL based distros which use "check-rpaths" tool.
-- Don't skip jobs in scontrol hold.
-- Avoid locking the job_list when unneeded.
-- Allow --cpu-bind=verbose to be used with SLURM_HINT environment variable.
-- Make it so fixing runaway jobs will not alter the same job requeued
when not runaway.
-- Avoid checking state when searching for runaway jobs.
-- Remove redundant check for end time of job when searching for runaway jobs.
-- Make sure that we properly check for runawayjobs where another job might
have the same id (for example, if a job was requeued) by also checking the
submit time.
-- Add scontrol update job ResetAccrueTime to clear a job's time
previously accrued for priority.
-- cons_res: Delay exiting cr_job_test until after cores/cpus are calculated
and distributed.
-- Fix bug where binary in cwd would trump binary in PATH with test_exec.
-- Fix check to test printf("%s\n", NULL); to not require
-Wno-format-truncation CFLAG.
-- Fix JobAcctGatherParams=UsePss to report the correct usage.
-- Fix minor memory leak in pmix plugin.
-- Fix minor memory leak in slurmctld when reading configuration.
-- Handle return codes correctly from pthread_* functions.
-- Fix minor memory leak when a slurmd is unable to contact a slurmctld
when trying to register.
-- Fix sreport sizesbyaccount report when using Flatview and accounts.
-- Fix incorrect shift when dealing with node weights and scheduling.
-- libslurm/perl - Fix segfault caused by incorrect hv_to_slurm_ctl_conf.
-- Add qos and assoc options to confirmation dialogs.
-- Handle updating identical license or partition information correctly.
-- Makes sure accounts and QOS' are all lower case to match documentation
when read in from the slurm.conf file.
-- Don't consider partitions without enough nodes in reservation,
main scheduler.
-- Set SLURM_NTASKS correctly if having to determine from other options.
-- Removed GCP scripts from contribs. Now located at:
https://github.com/SchedMD/slurm-gcp.
-- Don't check existence of srun --prolog or --epilog executables when set to
"none" and SLURM_TEST_EXEC is used.
-- Add "P" suffix support to job and step tres specifications.
-- When doing a reconfigure handle QOS' GrpJobsAccrue correctly.
-- Remove unneeded extra parentheses from sh5util.
-- Fix jobacct_gather/cgroup to work correctly when more than one task is
started on a node.
-- If requesting --ntasks-per-node with no tasks set tasks correctly.
-- Accept modifiers for TRES originally added in 6f0342e0358.
-- Don't remove reservation on slurmctld restart if nodes are removed from
configuration.
-- Fix bad xfree in task/cgroup.
-- Fix removing counters if a job array isn't subject to limits and is
canceled while pending.
-- Make sure SLURM_NTASKS_PER_NODE is set correctly when env is overwritten
by the command line.
-- Clean up step on a failed node correctly.
-- mpi/pmix: Fixed the logging of collective state.
-- mpi/pmix: Make multi-slurmd work correctly when using ring communication.
-- mpi/pmix: Fix double invocation of the PMIx lib fence callback.
-- mpi/pmix: Remove unneeded libpmix callback drop in tree-based coll.
-- Fix race condition in route/topology when the slurmctld is reconfigured.
-- In route/topology validate the slurmctld doesn't try to initialize the
node system.
-- Fix issue when requesting invalid gres.
-- Validate job_ptr in backfill before restoring preempt state.
-- Fix issue when job's environment is minimal and only contains variables
Slurm is going to replace internally.
-- When handling runaway jobs remove all usage before rollup to remove any
time that wasn't existent instead of just updating lines that have time
with a lesser time.
-- salloc - set SLURM_NTASKS_PER_CORE and SLURM_NTASKS_PER_SOCKET in the
environment if the corresponding command line options are used.
-- slurmd - fix handling of the -f flag to specify alternate config file
locations.
-- Fix scheduling logic to avoid using nodes that require a reboot for KNL
node change when possible.
-- Fix scheduling logic bug. There should have been a test for _not_
NODE_SET_REBOOT to continue.
-- Fix a scheuling logic bug with respect to XOR operation support when there
are down nodes.
-- If there is a constraint construct of the form "[...&...]"
then an error is generated if more than one of those specifications
contains KNL NUMA or MCDRAM modes.
-- Fix stepd segfault race if slurmctld hasn't registered with the launching
slurmd yet delivering it's TRES list.
-- Add SchedulerParameters option of bf_ignore_newly_avail_nodes to avoid
scheduling lower priority jobs on resources that become available during
the backfill scheduling cycle when bf_continue is enabled.
-- Decrement message_connections in stepd code on error path correctly.
-- Decrease an error message to be debug.
-- Fix missing suffixes in squeue.
-- pam_slurm_adopt - send an error message to the user if no Slurm jobs
can be located on the node.
-- Run SlurmctldPrimaryOffProg when the primary slurmctld process shuts down.
-- job_submit/lua: Add several slurmctld return codes.
-- job_submit/lua: Add user/group info to jobs.
-- Fix formatting issues when printing uint64_t.
-- Bump RLIMIT_NOFILE for daemons in systemd services.
-- Expand %x in job name in 'scontrol show job'.
-- salloc/sbatch/srun - print warning if mutually exclusive options of --mem
and --mem-per-cpu are both set.
* Changes in Slurm 18.08.3
==========================
-- Fix regression in 18.08.1 that caused dbd messages to not be queued up
when the dbd was down.
-- Fix regression in 18.08.1 that can cause a slurmctld crash when splitting
job array elements.
* Changes in Slurm 18.08.2
==========================
-- Correctly initialize variable in env_array_user_default().
-- Remove race condition when signaling starting step.
-- Fix issue where 17.11 job's using GRES in didn't initialize new 18.08
structures after unpack.
-- Stop removing nodes once the minimum CPU or node count for the job is
reached in the cons_res plugin.
-- Process any changes to MinJobAge and SlurmdTimeout in the slurmctld when
it is reconfigured to determine changes in its background timers.
-- Use previous SlurmdTimeout in the slurmctld after a reconfigure to
determine the time a node has been down.
-- Fix multi-cluster srun between clusters with different SelectType plugins.
-- Fix removing job licenses on reconfig/restart when configured license
counts are 0.
-- If a job requested multiple licenses and one license was removed then on
a reconfigure/restart all of the licenses -- including the valid ones
would be removed.
-- Fix issue where job's license string wasn't updated after a restart when
licenses were removed or added.
-- Add allow_zero_lic to SchedulerParameters.
-- Avoid scheduling tasks in excess of ArrayTaskThrottle when canceling tasks
of an array.
-- Fix jobs that request memory per node and task count that can't be
scheduled right away.
-- Avoid infinite loop with jobacct_gather/linux when pids wrap around
/proc/sys/kernel/pid_max.
-- Fix --parsable2 output for sacct and sstat commands to remove a stray
trailing delimiter.
-- When modifying a user's name in sacctmgr enforce PreserveCaseUser.
-- When adding a coordinator or user that was once deleted enforce
PreserveCaseUser.
-- Correctly handle scenarios where a partitions MaxMemPerCPU is less than
a jobs --mem-per-cpu and also -c is greater than 1.
-- Set AccrueTime correctly when MaxJobsAccrue is disabled and BeginTime has
not been established.
-- Correctly account for job arrays for new {Max/Grp}JobsAccrue limits.
* Changes in Slurm 18.08.1
==========================
-- Remove commented-out parts of man pages related to cons_tres work in 19.05,
as these were showing up on the web version due to a syntax error.
-- Prevent slurmctld performance issues in main background loop if multiple
backup controllers are unavailable.
-- Add missing user read association lock in burst_buffer/cray during init().
-- Fix incorrect spacing for PartitionName lines in 'scontrol write config'.
-- Fix creation of step hwloc xml file for after cpuset cgroup has been
created.
-- Add userspace as a valid default governor.
-- Add timers to group_cache_lookup so if going slow advise
LaunchParameters=send_gids.
-- Fix SLURM_STEP_GRES=none to work correctly.
-- Fix potential memory leak when a failure happens unpacking a ctld_multi_msg.
-- Fix potential double free when a faulure happens when unpacking a
node_registration_status_msg.
-- Fix sacctmgr show runaways.
-- Removed non-POSIX append operator from configure script for non-bash
support.
-- Fix incorrect spacing for PartitionName lines in 'scontrol write config'.
-- Fix sacct to not print huge reserve times when the job was never eligible.
-- burst_buffer/cray - Add missing locks around assoc_mgr when timing out a
burst buffer.
-- burst_buffer/cray - Update burst buffers when an association or qos
is removed from the system.
-- Remove documentation for deprecated Cray/ALPS systems. Please switch to
Native Cray mode instead.
-- Completely copy features when copying the list in the slurmctld.
-- PMIX - Fix issue with packing processes when using an arbitrary task
distribution.
-- Fix hostlists to be able to handle nodenames with '-' in them surrounded
by integers.
-- Fix correct job CPU count allocated.
-- Fix sacctmgr setting GrpJobs limit when setting GrpJobsAccrue limit.
-- Change the defaults to MemLimitEnforce=no and NoOverMemoryKill
(See RELEASE_NOTES).
-- Prevent abort when using Cray node features plugin on non-knl.
-- Add ability to reboot down nodes with scontrol reboot_nodes.
-- Protect against sending to the slurmdbd if the connection has gone away.
-- Fix invalid read when not using backup slurmctlds.
-- Prevent acct coordinators from changing default acct on add user.
-- Don't allow scontrol top do modify job priorities when priority == 1.
-- slurmsmwd - change parsing code to handle systems with the svid or inst
fields set in xtconsumer output.
-- Fix infinite loop in slurmctld if GRES is specified without a count.
-- sacct: Print error when unknown arguments are found.
-- Fix checking missing return codes when unpacking structures.
-- Fix slurm.spec-legacy including slurmsmwd
-- More explicit error message when cgroup oom-kill events detected.
-- When updating an association and are unable to find parent association
initialize old fairshare association pointer correctly.
-- Wrap slurm_cond_signal() calls with mutexes where needed.
-- Fix correct timeout with resends in slurm_send_only_node_msg.
-- Fix pam_slurm_adopt to honor action_adopt_failure.
-- Have the slurmd recreate the hwloc xml file for the full system on restart.
-- sdiag - correct the units for the gettimeofday() stat to microseconds.
-- Set SLURM_CLUSTER_NAME environment variable in MailProg to the ClusterName.
-- smail - use SLURM_CLUSTER_NAME environment variable.
-- job_submit/lua - expose argc/argv options through lua interface.
-- slurmdbd - prevent false-positive warning about innodb settings having
been set too low if they're actually set over 2GB.
* Changes in Slurm 18.08.0
==========================
-- Fix segfault on job arrays when starting controller without dbd up.
-- Fix pmi2 to build with gcc 8.0+.
-- Remove the development snapshot of select/cons_tres plugin.
-- Fix slurmd -C to not print benign error from xcpuinfo.
-- Fix potential double locks in the assoc_mgr.
-- Fix sacct truncate flag behavior Truncated pending jobs will always
return a start and end time set to the window end time, so elapsed
time is 0.
-- Fix extern step hanging forever when canceled right after creation.
-- sdiag - add slurmctld agent count.
-- Remove requirement to have cgroup_allowed_devices_file.conf in order to
constrain devices. By default all devices are allowed and GRES, that are
associated with a device file, that are not requested are restricted.
-- Fix proper alignment of clauses when determining if more nodes are needed
for an allocation.
-- Fix race condition when canceling a federation job that just started
running.
-- Prevent extra resources from being allocated when combining certain flags.
-- Fix problem in task/affinity plugin that can lead to slurmd fatal()'ing
when using --hint=nomultithread.
-- Fix left over socket file when step is ending and using pmi2 with
%n or %h in the spool dir.
-- Don't remove hwloc full system xml file when shutting down the slurmd.
-- Fix segfault that could happen with a het job when it was canceled while
starting.
-- Fix scan-build false-positive warning about invalid memory access in the
_ping_controller() function.
-- Add control_inx value to trigger_info_msg_t to permit future work in the
trigger management code to distinguish which of multiple backup controllers
has changed state.
* Changes in Slurm 18.08.0rc1
==============================
-- Add TimelimitRaw sacct output field to display timelimit numbers.
-- Fix job array preemption during backfill scheduling.
-- Fix scontrol -o show assoc output.
-- Add support for sacct --whole-hetjob=[yes|no] option.
-- Make salloc handle node requests the same as sbatch.
-- Add shutdown_on_reboot SlurmdParameter to control whether the Slurmd will
shutdown itself down or not when a reboot request is received.
-- Add cancel_reboot scontrol option to cancel pending reboot of nodes.
-- Make Users case insensitive in the database based on
Parameters=PreserveCaseUser in the slurmdbd.conf.
-- Improve scheduling when dealing with node_features that could have a
boot delay.
-- Fix issue if a step launch fails we don't get a bunch of '(null)' strings
in the step record for usage.
-- Changed the default AuthType for slurmdbd to auth/munge.
-- Make it so libpmi.so doesn't link to libslurm.so.$apiversion.
-- Added 'remote-fs.target' to After directive of slurmd.service file.
-- Fix filetxt plugin to handle it when you aren't running a jobacct_gather
plugin.
-- Remove drain on node when reboot nextstate used.
-- Speed up pack of job's qos.
-- Fix race condition when trying to update reservation in the database.
-- For the PrologFlags slurm.conf option, make NoHold mutually exclusive with
Contain and/or X11 options.
-- Revise the handling of SlurmctldSyslogLevel and SlurmdSyslogLevel options
in slurm.conf and DebugLevelSyslog in slurmdbd.conf.
-- Gate reading the cgroup.conf file.
-- Gate reading the acct_gather_* plugins.
-- Add sacctmgr options to prevent/manage job queue stuffing:
- GrpJobsAccrue=<max_jobs>
Maximum number of pending jobs in aggregate able to accrue age priority
for this association and all associations which are children of this
association. To clear a previously set value use the modify command with
a new value of -1.
- MaxJobsAccrue=<max_jobs>
Maximum number of pending jobs able to accrue age priority at any given
time for the given association. This is overridden if set directly on a
user. Default is the cluster's limit. To clear a previously set value use
the modify command with a new value of -1.
- MinPrioThreshold
Minimum priority required to reserve resources when scheduling.
* Changes in Slurm 18.08.0pre2
==============================
-- Remove support for "ChosLoc" configuration parameter.
-- Configuration parameters "ControlMachine", "ControlAddr", "BackupController"
and "BackupAddr" replaced by an ordered list of "SlurmctldHost" records
with the optional address appended to the name enclosed in parenthesis.
For example: "SlurmctldHost=head(12.34.56.78)". An arbitrary number of
backup servers can be configured.
-- When a pending job's state includes "UnavailableNodes" do not include the
nodes in FUTURE state.
-- Remove --immediate option from sbatch.
-- Add infrastructure for per-job and per-step TRES parameters: tres-per-job,
tres-per-node, tres-per-socket, tres-per-task, cpus-per-tres, mem-per-tres,
tres-bind and tres-freq. These new parameters are not currently used, but
have been added to the appropriate RPCs.
-- Add DefCpuPerGpu and DefMemPerGpu to global and per-partition configuration
parameters. Shown in scontrol/sview as "JobDefaults=...". NOTE: These
options are for future use and currently have no effect.
-- Fix for setting always the correct status on job update in mysql
-- Add ValidateMode configuration parameter to knl_cray.conf for static
MCDRAM/NUMA configurations.
-- Fix security issue in accounting_storage/mysql plugin by always escaping
strings within the slurmdbd. CVE-2018-7033.
-- Disable local PTY output processing when using 'srun --unbuffered'. This
prevents the PTY subsystem from inserting extraneous \r characters into
the output stream.
-- Change the column name for the %U (User ID) field in squeue to 'UID'.
-- CRAY - Add CheckGhalQuiesce to the CommunicationParameters.
-- When a process is core dumping, avoid terminating other processes in that
task group. This fixes a problem with writing out incomplete OpenMP core
files.
-- CPU frequency management enhancements: If scaling_available_frequencies
file is not available, then derive values from scaling_min_freq and
scaling_max_freq values. If cpuinfo_cur_freq file is not available then
try to use scaling_cur_freq.
-- Add pending jobs count to sdiag output.
-- Fix update job function. There were some incosistencies on the behavior
that caused time limits to be modified when swapping QOS, bad permissions
check for a coordinator and AllowQOS and DenyQOS were not enforced on
job update.
-- Add configuration paramerers SlurmctldPrimaryOnProg and
SlurmctldPrimaryOffProg, which define programs to execute when a slurmctld
daemon becomes the primary server or goes from primary to backup mode.
-- Add configuration paramerers SlurmctldAddr for use with virtual IP to manage
backup slurmctld daemons.
-- Explicitly shutdown the slurmd process when instructed to reboot.
-- Add ability to create/update partition with TRESBillingWeights through
scontrol.
-- Calcuate TRES billing values at submission so that billing limits can be
enforced at submission with QOS DenyOnLimit.
-- Add node_features plugin function "node_features_p_reboot_weight()" to
return the node weight to be used for a compute node that requires reboot
for use (e.g. to change the NUMA mode of a KNL node).
-- Add NodeRebootWeight parameter to knl.conf configuration file.
-- Fix insecure handling of job requested gid field. CVE-2018-10995.
-- Fix srun to return highest signal of any task.
-- Completely remove "gres" field from step record. Use "tres_per_node",
"tres_per_socket", etc.
-- Add "Links" parameter to gres.conf configuration file.
-- Force slurm_mktime() to set tm_isdst to -1 so anyone using the function
doesn't forget to set it.
-- burst_buffer.conf - Add SetExecHost flag to enable burst buffer access
from the login node for interactive jobs.
-- Append ", with requeued tasks" to job array "end" emails if any tasks in the
array were requeued. This is a hint to use "sacct --duplicates" to see the
whole picture of the array job.
-- Add ResumeFailProgram slurm.conf option to specify a program that is called
when a node fails to respond by ResumeTimeout.
-- Add new job pending reason of "ReqNodeNotAvail, reserved for maintenance".
-- Remove AdminComment += syntax from 'scontrol update job'.
-- sched/backfill: Reset job time limit if needed for deadline scheduling.
-- For heterogeneous job component with required nodes, explicitly exclude
those nodes from all other job components.
-- Add name of partition used to output of srun --test-only output (valuable
for jobs submitted to multiple partitions).
-- If MailProg is not configured and "/bin/mail" (the default) does not exist,
but "/usr/bin/mail" does exist then use "/usr/bin/mail" as a default value.
-- sdiag output now reports outgoing slurmctld message queue contents.
-- Fix issue in performance when reading slurm conf having nodes with features.
-- Make it so the slurmdbd's pid file gets created before initing
the database.
-- Improve escaping special characters on user commands when specifying paths.
-- Fix directory names with special char '\' that are not handled correctly.
-- Add salloc/sbatch/srun option of --gres-flags=disable-binding to disable
filtering of CPUs with respect to generic resource locality. This option is
currently required to use more CPUs than are bound to a GRES (i.e. if a GPU
is bound to the CPUs on one socket, but resources on more than one socket
are required to run the job). This option may permit a job to be allocated
resources sooner than otherwise possible, but may result in lower job
performance.
-- SlurmDBD - Print warning if MySQL/MariaDB internal tuning is not at least
half of the recommended values.
-- Move libpmi from src/api to contribs/pmi.
-- Add ability to specify a node reason when rebooting nodes with "scontrol
reboot".
-- Add nextstate option to "scontrol reboot" to dictate state of node after
reboot.
-- Consider "resuming" (nextstate=resume) nodes as available in backfill
future scheduling and don't replace "resuming" nodes in reservations.
-- Add the use of a xml file to help performance when using hwloc.
* Changes in Slurm 18.08.0pre1
==============================
-- Add new burst buffer state of "teardown-fail" to indicate the burst buffer
teardown operation is failing on specific buffers. This changes the numeric
value of the BB_STATE_COMPLETE type. Any Slurm version 17.02 or 17.11 tool
used to report burst buffer state information will report a state of "66"
rather than "complete" for burst buffers which have been deleted, but still
exist in the slurmctld daemon's tables (a very short-lived situation).
-- Multiple backup slurmctld daemons can be configured:
* Specify "BackupController#=<hostname> and "BackupAddr#=<address>" to
identify up to 9 backup servers.
* Output format of "scontrol ping" and the daemon status at the end of
"scontrol status" is modified to report up status of the primary and all
backup servers.
* "scontrol takeover [#]" command can now identify the SlurmctldHost
index number. Default value is "1" (the first backup configured
SlurmctldHost).
-- Enable jobs with zero node count for creation and/or deletion of persistent
burst buffers.
* The partition default MinNodes configuration parameter is now 0
(previously 1 node).
* Zero size jobs disabled for job arrays and heterogeneous jobs, but
supported for salloc, sbatch and srun commands.
-- Add "scontrol show dwstat" command to display Cray burst buffer status.
-- Add "GetSysStatus" option to burst_buffer.conf file. For burst_buffer/cray
this would indicate the location of the "dwstat" command.
-- Add node and partition configuration options of "CpuBind" to control default
task binding. Modify the scontrol to report and modify these parameters.
-- Add "NumaCpuBind" option to knl.conf file to automatically change a node's
CpuBind parameter based upon changes to a node's NUMA mode.
-- Add sbatch "--batch" option to identify features required on batch node.
For example "sbatch --batch=haswell ...".
-- Add "BatchFeatures" field to output of "scontrol show job".
-- Add support for "--bb" option to sbatch command.
-- Add new SystemComment field to job data structure and database. Currently
used for Burst Buffer error logs.
-- Expand reservation "flags" field from 32 to 64 bits.
-- Add job state flag of "SIGNALING" to avoid race condition with multiple
SIGSTOP/SIGCONT signals for the same job being active at the same time.
-- Properly handle srun --will-run option when there are jobs in COMPLETING
state.
-- Properly report who is signaling a step.
-- Don't combine updated reservation records in sreport's reservation report.
-- node_features plugin - Add suport for XOR & XAND of job constraints (node
feature specifications).
-- Add support for parenthesis in a job's constraint specification to group
like options together. For example
--constraint="[(knl&snc4&flat)*4&haswell*1]" might be used to specify that
four nodes with the features "knl", "snc4" and "flat" plus one node with
the feature "haswell" are required.
-- Improvements to how srun searches for the executible when using cwd.
-- Now programs can be checked before execution if test_exec is set when using
multi-prog option.
-- Report NodeFeatures plugin configuration with scontrol and sview commands.
-- Add acct_gather_profile/influxdb plugin.
-- Add new job state of SO/STAGE_OUT indicating that burst buffer stage-out
operation is in progress.
-- Correct SLURM_NTASKS and SLURM_NPROCS environment variable for heterogeneous
job step. Report values representing full allocation.
-- Expand advanced reservation feature specification to support parenthesis and
counts of nodes with specified features. Nodes with the feature currently
active will be prefered.
-- Defer job signaling until prolog is completed
-- Have the primary slurmctld wait until the backup has completely shutdown
before taking control.
-- Fix issue where unpacking job state after TRES count changed could lead to
invalid reads.
-- Heterogeneous job steps allocations supported with
* Open MPI (with Slurm's PMI2 and PMIx plugins) and
* Intel MPI (with Slurm's PMI2 plugin)
-- Remove redundant function arguments from task plugins:
* Remove "job_id" field from task_p_slurmd_batch_request() function.
* Remove "job_id" field from task_p_slurmd_launch_request() function.
* Remove "job_id" field from task_p_slurmd_reserve_resources() function.
-- Change function name from node_features_p_changible_feature() to
node_features_p_changeable_feature in node_features plugin.
-- Add Slurm configuration file check logic using "slurmctld -t" command.
* Changes in Slurm 17.11.10
===========================
-- Move priority_sort_part_tier from slurmctld to libslurm to make it possible
to run the regression tests 24.* without changing that code since it links
directly to the priority plugin where that function isn't defined.
-- Fix issue where job time limits can increase to max walltime when updating
a job with scontrol.
-- Fix invalid protocol_version manipulation on big endian platforms causing
srun and sattach to fail.
-- Fix for QOS, Reservation and Alias env variables in srun.
-- mpi/pmi2 - Backport 6a702158b49c4 from 18.08 to avoid dangerous detached
thread.
-- When allowing heterogeneous steps make sure we copy all the options to
avoid copying strings that may be overwritten.
-- Print correctly when sh5util finds and empty file.
-- Fix sh5util to not seg fault on exit.
-- Fix sh5util to check correctly for H5free_memory.
-- Adjust OOM monitoring function in task/cgroup to prevent problems in
regression suite from leaked file descriptors.
-- Fix issue with gres when defined with a type and no count
(i.e. gres=gpu/tesla) it would get a count of 0.
-- Allow sstat to talk to slurmd's that are new in protocol version.
-- Permit database names over 33 characters in accounting_storage/mysql.
-- Fix negative values when profiling.
-- Fix srun segfault caused by invalid memory reads on the env.
-- Fix segfault on job arrays when starting controller without dbd up.
-- Fix pmi2 to build with gcc 8.0+.
-- Fix proper alignment of clauses when determining if more nodes are needed
for an allocation.
-- Fix race condition when canceling a federation job that just started
running.
-- Prevent extra resources from being allocated when combining certain flags.
-- Fix problem in task/affinity plugin that can lead to slurmd fatal()'ing
when using --hint=nomultithread.
-- Fix left over socket file when step is ending and using pmi2 with
%n or %h in the spool dir.
-- Fix incorrect spacing for PartitionName lines in 'scontrol write config'.
-- Fix sacct to not print huge reserve times when the job was never eligible.
-- burst_buffer/cray - Add missing locks around assoc_mgr when timing out a
burst buffer.
-- burst_buffer/cray - Update burst buffers when an association or qos
is removed from the system.
-- If failed over to a backup controller, ensure the agent thread is launched
to handle deferred tasks.
-- Fix correct job CPU count allocated.
-- Protect against sending to the slurmdbd if the connection has gone away.
-- Fix checking missing return codes when unpacking structures.
-- Fix slurm.spec-legacy including slurmsmwd
-- More explicit error message when cgroup oom-kill events detected.
-- When updating an association and are unable to find parent association
initialize old fairshare association pointer correctly.
-- Wrap slurm_cond_signal() calls with mutexes where needed.
-- Fix correct timeout with resends in slurm_send_only_node_msg.
-- Fix pam_slurm_adopt to honor action_adopt_failure.
-- job_submit/lua - expose argc/argv options through lua interface.
* Changes in Slurm 17.11.9-2
===========================
-- Fix printing of node state "drain + reboot" (and other node state flags).
-- Fix invalid read (segfault) when sorting multi-partition jobs.
-- Move several new error() messages to debug() to keep them out of users'
srun output.
* Changes in Slurm 17.11.9
==========================
-- Fix segfault in slurmctld when a job's node bitmap is NULL during a
scheduling cycle. Primarily caused by EnforcePartLimits=ALL.
-- Remove erroneous unlock in acct_gather_energy/ipmi.
-- Enable support for hwloc version 2.0.1.
-- Fix 'srun -q' (--qos) option handling.
-- Fix socket communication issue that can lead to lost task completition
messages, which will cause a permanently stuck srun process.
-- Handle creation of TMPDIR if environment variable is set or changed in
a task prolog script.
-- Avoid node layout fragmentation if running with a fixed CPU count but
without Sockets and CoresPerSocket defined.
-- burst_buffer/cray - Fix datawarp swap default pool overriding jobdw.
-- Fix incorrect job priority assignment for multi-partition job with
different PriorityTier settings on the partitions.
-- Fix sinfo to print correct node state.
* Changes in Slurm 17.11.8
==========================
-- Fix incomplete RESPONSE_[RESOURCE|JOB_PACK]_ALLOCATION building path.
-- Do not allocate nodes that were marked down due to the node not responding
by ResumeTimeout.
-- task/cray plugin - search for "mems" cgroup information in the file
"cpuset.mems" then fall back to the file "mems".
-- Fix ipmi profile debug uninitialized variable.
-- Improve detection of Lua package on older RHEL distributions.
-- PMIx: fixed the direct connect inline msg sending.
-- MYSQL: Fix issue not handling all fields when loading an archive dump.
-- Allow a job_submit plugin to change the admin_comment field during
job_submit_plugin_modify().
-- job_submit/lua - fix access into reservation table.
-- MySQL - Prevent deadlock caused by archive logic locking reads.
-- Don't enforce MaxQueryTimeRange when requesting specific jobs.
-- Modify --test-only logic to properly support jobs submitted to more than
one partition.
-- Prevent slurmctld from abort when attempting to set non-existing
qos as def_qos_id.
-- Add new job dependency type of "afterburstbuffer". The pending job will be
delayed until the first job completes execution and it's burst buffer
stage-out is completed.
-- Reorder proctrack/task plugin load in the slurmstepd to match that of slurmd
and avoid race condition calling task before proctrack can introduce.
-- Prevent reboot of a busy KNL node when requesting inactive features.
-- Revert to previous behavior when requesting memory per cpu/node introduced
in 17.11.7.
-- Fix to reinitialize previously adjusted job members to their original value
when validating the job memory in multi-partition requests.
-- Fix _step_signal() from always returning SLURM_SUCCESS.
-- Combine active and available node feature change logs on one line rather
than one line per node for performance reasons.
-- Prevent occasionally leaking freezer cgroups.
-- Fix potential segfault when closing the mpi/pmi2 plugin.
-- Fix issues with --exclusive=[user|mcs] to work correctly
with preemption or when job requests a specific list of hosts.
-- Make code compile with hdf5 1.10.2+
-- mpi/pmix: Fixed the collectives canceling.
-- SlurmDBD: improve error message handling on archive load failure.
-- Fix incorrect locking when deleting reservations.
-- Fix incorrect locking when setting up the power save module.
-- Fix setting format output length for squeue when showing array jobs.
-- Add xstrstr function.
-- Fix printing out of --hint options in sbatch, salloc --help.
-- Prevent possible divide by zero in _validate_time_limit().
-- Add Delegate=yes to the slurmd.service file to prevent systemd from
interfering with the jobs' cgroup hierarchies.
-- Change the backlog argument to the listen() syscall within srun to 4096
to match elsewhere in the code, and avoid communication problems at scale.
* Changes in Slurm 17.11.7
==========================
-- Fix for possible slurmctld daemon abort with NULL pointer.
-- Fix different issues when requesting memory per cpu/node.
-- PMIx - override default paths at configure time if --with-pmix is used.
-- Have sprio display jobs before eligible time when
PriorityFlags=ACCRUE_ALWAYS is set.
-- Make sure locks are always in place when calling _post_qos_list().
-- Notify srun and ctld when unkillable stepd exits.
-- Fix slurmstepd deadlock in stepd cleanup caused by race condition in
the jobacct_gather fini() interfaces introduced in 17.11.6.
-- Fix slurmstepd deadlock in PMIx startup.
-- task/cgroup - fix invalid free() if the hwloc library does not return a
string as expected.
-- Fix insecure handling of job requested gid field. CVE-2018-10995.
* Changes in Slurm 17.11.6
==========================
-- CRAY - Add slurmsmwd to the contribs/cray dir.
-- sview - fix crash when closing any search dialog.
-- Fix initialization of variable in stepd when using native x11.
-- Fix reading slurm_io_init_msg to handle partial messages.
-- Fix scontrol create res segfault when wrong user/account parameters given.
-- Fix documentation for sacct on parameter -X (--allocations)
-- Change TRES Weights debug messages to debug3.
-- FreeBSD - assorted fixes to restore build.
-- Fix for not tracking environment variables from unrelated different jobs.
-- PMIX - Added the direct connect authentication.
When upgrading this may cause issues with jobs using pmix starting on mixed
slurmstepd versions where some are less than 17.11.6.
-- Prevent the backup slurmctld from losing the active/available node
features list on takeover.
-- Add documentation for fix IDLE*+POWER due to capmc stuck in Cray systems.
-- Fix missing mutex unlock when prolog is failing on a node, leading to a
hung slurmd.
-- Fix locking around Cray CCM prolog/epilog.
-- Add missing fed_mgr read locks.
-- Fix issue incorrectly setting a job time_start to 0 while requeueing.
-- smail - remove stray '-s' from mail subject line.
-- srun - prevent segfault if ClusterName setting is unset but
SLURM_WORKING_CLUSTER environment variable is defined.
-- In configurator.html web pages change default configuration from
task/none to task/affinity plugin and from select/linear plugin to
select/cons_res plus CR_Core.
-- Allow jobs to run beyond a FLEX reservation end time.
-- Fix problem with wrongly set as Reservation job state_reason.
-- Prevent bit_ffs() from returnig value out of bitmap range.
-- Improve performance of 'squeue -u' when PrivateData=jobs is enabled.
-- Make UnavailableNodes value in job reason be correct for each job.
-- Fix 'squeue -o %s' on Cray systems.
-- Fix incorrect error thrown when cancelling part of a job array.
-- Fix error code and scheduling problem for --exclusive=[user|mcs].
-- Fix build when lz4 is in a non-standard location.
-- Be able to force power_down of cloud node even if in power_save state.
-- Allow cloud nodes to be recognized in Slurm when booted out of band.
-- Fixes race condition in _pack_job_gres() when is called multiple times.
-- Increase duration of "sleep" command used to keep extern step alive.
-- Remove unsafe usage of pthread_cancel in slurmstepd that can lead to
to deadlock in glibc.
-- Fix total TRES Billing on partitions.
-- Don't tear down a BB if a node fails and --no-kill or resize of a job
happens.
-- Remove unsafe usage of pthread_cancel in pmix plugin that can lead to
to deadlock in glibc.
-- Fix fatal in controller when loading completed trigger
-- Ignore reservation overlap at submission time.
-- GRES type model and QOS limits documentation added
-- slurmd - fix ABRT on SIGINT after reconfigure with MemSpecLimit set.
-- PMIx - move two error messages on retry to debug level, and only display
the error after the retry count has been exceeded.
-- Increase number of tries when sending responses to srun.
-- Fix checkpointing requeued/completing jobs in a bad state which caused a
segfault on restart.
-- Fix srun on ppc64 platforms.
-- Prevent slurmd from starting steps if the Prolog returns an error when using
PrologFlags=alloc.
-- priority/multifactor - prevent segfault running sprio if a partition has
just been deleted and PriorityFlags=CALCULATE_RUNNING is turned on.
-- job_submit/lua - add ESLURM_INVALID_TIME_LIMIT return code value.
-- job_submit/lua - print an error if the script calls log.user in
job_modify() instead of returning it to the next submitted job erroneously.
-- select/linear - handle job resize correctly.
-- select/cons_res - improve handling of --cores-per-socket requests.
* Changes in Slurm 17.11.5
==========================
-- Fix cloud nodes getting stuck in DOWN+POWER_UP+NO_RESPOND state after not
responding by ResumeTimeout.
-- Add job's array_task_cnt and user_name along with partitions
[max|def]_mem_per_[cpu|node], max_cpus_per_node, and max_share with the
SHARED_FORCE definition to the job_submit/lua plugin.
-- srun - fix for SLURM_JOB_NUM_NODES env variable assignment.
-- sacctmgr - fix runaway jobs identification.
-- Fix for setting always the correct status on job update in mysql.
-- Fix issue if running with an association manager cache (slurmdbd was down
when slurmctld was started) you could loose QOS usage information.
-- CRAY - Fix spec file to work correctly.
-- Set scontrol exit code to 1 if attempting to update a node state to DRAIN
or DOWN without specifying a reason.
-- Fix race condition when running with an association manager cache
(slurmdbd was down when slurmctld was started).
-- Print out missing SLURM_PERSIST_INIT slurmdbd message type.
-- Fix two build errors related to use of the O_CLOEXEC flag with older glibc.
-- Add Google Cloud Platform integration scripts into contribs directory.
-- Fix minor potential memory leak in backfill plugin.
-- Add missing node flags (maint/power/etc) to node states.
-- Fix issue where job time limits may end up at 1 minute when using the
NoReserve flag on their QOS.
-- Fix security issue in accounting_storage/mysql plugin by always escaping
strings within the slurmdbd. CVE-2018-7033.
-- Soften messages about best_fit topology to debug2 to avoid alarm.
-- Fix issue in sreport reservation utilization report to handle more
allocated time than 100% (Flex reservations).
-- When a job is requesting a Flex reservation prefer the reservation's nodes
over any other nodes.
* Changes in Slurm 17.11.4
==========================
-- Add fatal_abort() function to be able to get core dumps if we hit an
"impossible" edge case.
-- Link slurmd against all libraries that slurmstepd links to.
-- Fix limits enforce order when they're set at partition and other levels.
-- Add slurm_load_single_node() function to the Perl API.
-- slurm.spec - change dependency for --with lua to use pkgconfig.
-- Fix small memory leaks in node_features plugins on reconfigure.
-- slurmdbd - only permit requests to update resources from operators or
administrators.
-- Fix handling of partial writes in io_init_msg_write_to_fd() which can
lead to job step launch failure under higher cluster loads.
-- MYSQL - Fix to handle quotes in a given work_dir of a job.
-- sbcast - fix a race condition that leads to "Unspecified error".
-- Log that support for the ChosLoc configuration parameter will end in Slurm
version 18.08.
-- Fix backfill performance issue where bf_min_prio_reserve was not respected.
-- Fix MaxQueryTimeRange checks.
-- Print MaxQueryTimeRange in "sacctmgr show config".
-- Correctly check return codes when creating a step to check if needing to
wait to retry or not.
-- Fix issue where a job could be denied by Reason=MaxMemPerLimit when not
requesting any tasks.
-- In perl tools, fix for regexp that caused extra incorrectly shown results.
-- Add some extra locks in fed_mgr to be extra safe.
-- Minor memory leak fixes in the fed_mgr on slurmctld shutdown.
-- Make sreport job reports also report duplicate jobs correctly.
-- Fix issues restoring certain Partition configuration elements, especially
when ReconfigFlags=KeepPartInfo is enabled.
-- Don't add TRES whose value is NO_VAL64 when building string line.
-- Fix removing array jobs from hash in slurmctld.
-- Print out missing user messages from jobsubmit plugin when srun/salloc are
waiting for an allocation.
-- Handle --clusters=all as case insensitive.
-- Only check requested clusters in federation when using --test-only
submission option.
-- In the federation, make it so you can cancel stranded sibling jobs.
-- Silence an error from PSS memory stat collection process.
-- Requeue jobs allocated to nodes requested to DRAIN or FAIL if nodes are
POWER_SAVE or POWER_UP, preventing jobs to start on NHC-failed nodes.
-- Make MAINT and OVERLAP resvervation flags order agnostic on overlap test.
-- Preserve node features when slurmctld daemons reconfigured including active
and available KNL features.
-- Prevent creation of multiple io_timeout threads within srun, which can
lead to fatal() messages when those unexpected and additional mutexes are
destroyed when srun shuts down.
-- burst_buffer/cray - Prevent use of "#DW create_persistent" and
"#DW destroy_persistent" directives available in Cray CLE6.0UP06. This
will be supported in Slurm version 18.08. Use "#BB" directives until then.
-- Fix task/cgroup affinity to behave correctly.
-- FreeBSD - fix build on systems built with WITHOUT_KERBEROS.
-- Fix to restore pn_min_memory calculated result to correctly enforce
MaxMemPerCPU setting on a partition when the job uses --mem.
-- slurmdbd - prevent infinite loop if a QOS is set to preempt itself.
-- Fix issue with log rotation for slurmstepd processes.
* Changes in Slurm 17.11.3-2
==========================
-- Revert node_features changes in 17.11.3 that lead to various segfaults on
slurmctld startup.
* Changes in Slurm 17.11.3
==========================
-- Send SIG_UME correctly to a step.
-- Sort sreport's reservation report by cluster, time_start, resv_name instead
of cluster, resv_name, time_start.
-- Avoid setting node in COMPLETING state indefinitely if the job initiating
the node reboot is cancelled while the reboot in in progress.
-- Scheduling fix for changing node features without any NodeFeatures plugins.
-- Improve logic when summarizing job arrays mail notifications.
-- Add scontrol -F/--future option to display nodes in FUTURE state.
-- Fix REASONABLE_BUF_SIZE to actually be 3/4 of MAX_BUF_SIZE.
-- When a job array is preempting make it so tasks in the array don't wait
to preempt other possible jobs.
-- Change free_buffer to FREE_NULL_BUFFER to prevent possible double free
in slurmstepd.
-- node_feature/knl_cray - Fix memory leaks that occur when slurmctld
reconfigured.
-- node_feature/knl_cray - Fix memory leak that can occur during normal
operation.
-- Fix srun environment variables for --prolog script.
-- Fix job array dependency with "aftercorr" option and some task arrays in
the first job fail. This fix lets all task array elements that can run
proceed rather than stopping all subsequent task array elements.
-- Fix potential deadlock in the slurmctld when using list_for_each.
-- Fix for possible memory corruption in srun when running heterogeneous job
steps.
-- Fix job array dependency with "aftercorr" option and some task arrays in
the first job fail. This fix lets all task array elements that can run
proceed rather than stopping all subsequent task array elements.
-- Fix output file containing "%t" (task ID) for heterogeneous job step to
be based upon global task ID rather than task ID for that component of the
heterogeneous job step.
-- MYSQL - Fix potential abort when attempting to make an account a parent of
itself.
-- Fix potentially uninitialized variable in slurmctld.
-- MYSQL - Fix issue for multi-dimensional machines when using sacct to
find jobs that ran on specific nodes.
-- Reject --acctg-freq at submit if invalid.
-- Added info string on sh5util when deleting an empty file.
-- Correct dragonfly topology support when job allocation specifies desired
switch count.
-- Fix minor memory leak on an sbcast error path.
-- Fix issues when starting the backup slurmdbd.
-- Revert uid check when requesting a jobid from a pid.
-- task/cgroup - add support to detect OOM_KILL cgroup events.
-- Fix whole node allocation cpu counts when --hint=nomultihtread.
-- Allow execution of task prolog/epilog when uid has access
rights by a secondary group id.
-- Validate command existence on the srun *[pro|epi]log options
if LaunchParameter test_exec is set.
-- Fix potential memory leak if clean starting and the TRES didn't change
from when last started.
-- Fix for association MaxWall enforcement when none is given at submission.
-- Add a job's allocated licenses to the [Pro|Epi]logSlurmctld.
-- burst_buffer/cray: Attempts by job to create persistent burst buffer when
one already exists owned by a different user will be logged and the job
held.
-- CRAY - Remove race in the core_spec where we add the slurmstepd to the
job container where if the step was canceled would also cancel the stepd
erroneously.
-- Make sure the slurmstepd blocks signals like SIGTERM correctly.
-- SPANK - When slurm_spank_init_post_opt() fails return error correctly.
-- When revoking a sibling job in the federation we want to send a start
message before purging the job record to get the uid of the revoked job.
-- Make JobAcctGatherParams options case-insensitive. Previously, UsePss
was the only correct capitialization; UsePSS or usepss were silently
ignored.
-- Prevent pthread_atfork handlers from being added unnecessarily after
'scontrol reconfigure', which can eventually lead to a crash if too
many handlers have been registered.
-- Better debug messages when MaxSubmitJobs is hit.
-- Docs - update squeue man page to describe all possible job states.
-- Prevent orphaned step_extern steps when a job is cancelled while the
prolog is still running.
* Changes in Slurm 17.11.2
==========================
-- jobcomp/elasticsearch - append Content-Type to the HTTP header.
-- MYSQL - Fix potential abort of slurmdbd when job has no TRES.
-- Add advanced reservation flag of "REPLACE_DOWN" to replace DOWN or DRAINED
nodes.
-- slurm.spec-legacy - add missing libslurmfull.so to slurm.files.
-- Fix squeue job ID filtering for pending job array records.
-- Fix potential deadlock in _run_prog() in power save code.
-- MYSQL - Add dynamic_offset in the database to force range for auto
increment ids for the tres_table.
-- MYSQL - Fix fallout from MySQL auto increment bug, see RELEASE_NOTES,
only affects current 17.11 users tracking licenses or GRES in the database.
-- Refactor logging logic to avoid possible memory corruption on non-x86
architectures.
-- Fix memory leak when getting jobs from the slurmdbd.
-- Fix incorrect logic behind MemorySwappiness, and only set the value when
specified in the configuration.
* Changes in Slurm 17.11.1-2
============================
-- MYSQL - Make index for pack_job_id
* Changes in Slurm 17.11.1
==========================
-- Fix --with-shared-libslurm option to work correctly.
-- Make it so only daemons log errors on configuration option duplicates.
-- Fix for ConstrainDevices=yes to work correctly.
-- Fix to purge old jobs using burst buffer if slurmctld daemon restarted
after the job's burst buffer work was already completed.
-- Make logging prefix for slurmstepd to happen as soon as possible.
-- mpi/pmix: Fix the job registration for the PMIx v2.1.
-- Fix uid check for signaling a step with anything but SIGKILL.
-- Return ESLURM_TRANSITION_STATE_NO_UPDATE instead of EAGAIN when trying to
signal a step that is still running a prolog.
-- Update Cray slurm_playbook.yaml with latest recommended version.
-- Only say a prolog is done running after the extern step is launched.
-- Wait to start a batch step until the prolog and extern step are
fully ran/launched. Only matters if running with
PrologFlags=[contain|alloc].
-- Truncate a range for SlurmctldPort to FD_SETSIZE elements and throw an
error, otherwise network traffic may be lost due to poll() not detecting
traffic.
-- Fix for srun --pack-group option that can reuse/corrupt memory.
-- Fix handling ultra long hostlists in a hostfile.
-- X11: fix xauth regex to handle '-' in hostnames again.
-- Fix potential node reboot timeout problem for "scontrol reboot" command.
-- Add ability for squeue to sort jobs by submit time.
-- CRAY - Switch to standard pid files on Cray systems.
-- Update jobcomp records on duplicate inserts.
-- If unrecognized configuration file option found then print an appropriate
fatal error message rather than relying upon random errno value.
-- Initialize job_desc_msg_t's instead of just memset'ing them.
-- Fix divide by zero when job requests no tasks and more memory than
MaxMemPer{CPU|NODE}.
-- Avoid changing Slurm internal errno on syslog() failures.
-- BB - Only launch dependent jobs after the burst buffer is staged-out
completely instead of right after the parent job finishes.
-- node_features/knl_generic - If plugin can not fully load then do not spawn
a background pthread (which will fail with invalid memory reference).
-- Don't set the next jobid to give out to the highest jobid in the system on
controller startup. Just use the checkpointed next use jobid.
-- Docs - add Slurm/PMIx and OpenMPI build notes to the mpi_guide page.
-- Add lustre_no_flush option to LaunchParameters for Native Cray systems.
-- Fix rpmbuild issue with rpm 4.13+ / Fedora 25+.
-- sacct - fix the display for the NNodes field when using the --units option.
-- Prevent possible double-xfree on a buffer in stepd_completion.
-- Fix for record job state on successful allocation but failed reply message.
-- Fill in the user_name field for batch jobs if not sent by the slurmctld.
(Which is the default behavior if PrologFlags=send_gids is not enabled.)
This prevents job launch problems for sites using UsePAM=1.
-- Handle syncing federated jobs that ran on non-origin clusters and were
cancelled while the origin cluster was down.
-- Fix accessing variable outside of lock.
-- slurm.spec: move libpmi to a separate package to solve a conflict with the
version provided by PMIx. This will require a separate change to PMIx as
well.
-- X11 forwarding: change xauth handling to use hostname/unix:display format,
rather than localhost:display.
-- mpi/pmix - Fix warning if not compiling with debug.
* Changes in Slurm 17.11.0
==========================
-- Fix documentation for MaxQueryTimeRange option in slurmdbd.conf.
-- Avoid srun abort trying to run on heterogeneous job component that has
ended.
-- Add SLURM_PACK_JOB_ID,SLURM_PACK_JOB_OFFSET to PrologSlurmctld and
EpilogSlurmctld environment.
-- Treat ":" in #SBATCH arguments as fatal error. The "#SBATCH packjob" syntax
must be used instead.
-- job_submit/lua plugin: expose pack_job fields to get.
-- Prevent scheduling deadlock with multiple components of heterogeneous job
in different partitions (i.e. one heterogeneous job component is higher
priority in one partition and another component is lower priority in a
different partition).
-- Fix for heterogeneous job starvation bug.
-- Fix some slurmctld memory leaks.
-- Add SLURM_PACK_JOB_NODELIST to PrologSlurmctld and EpilogSlurmctld
environment.
-- If PrologSlurmctld fails for pack job leader then requeue or kill all
components of the job.
-- Fix for mulitple --pack-group srun arguments given out of order.
-- Update slurm.conf(5) man page with updated example logrotate script.
-- Add SchedulerParameters=whole_pack configuration parameter. If set, then
hold, release and cancel operations on any component of a heterogeneous job
will be applied to all components
-- Handle FQDNs in xauth cookies for x11 display forwarding properly.
-- For heterogeneous job steps, the srun --open-mode option default value will
be set to "append".
-- Pack job scheduling list not being cleared between runs of the backfill
scheduler resulted in various anomalies.
-- Fix that backward compat for pmix version < 1.1.5.
-- Fix use-after-free that can lead to slurmstepd segfaulting when setting
ulimit values.
-- Add heterogeneous job start data to sdiag output.
-- X11 forwarding - handle systems with X11UseLocalhost=no set in sshd_config.
-- Fix potential missing issue with missin symbols in gres plugins.
-- Ignore querying clusters in federation that are down from status commands.
-- Base federated jobs off of origin job and not the local cluster in API.
-- Remove erroneous double '-' on rpath for libslurmfull.
-- Remove version from libslurmfull and move it to $LIBDIR/slurm since the ABI
could change from one version to the other.
-- Fix unused wall time for reservations.
-- Convert old reservation records to insert unused wall into the rows.
-- slurm.spec: further restructing and improvements.
-- Allow nodes state to be updated between FAIL and DRAIN.
-- x11 forwarding: handle build with alternate location for libssh2.
* Changes in Slurm 17.11.0rc3
==============================
-- Fix extern step to wait until launched before allowing job to start.
-- Add missing locks around figuring out TRES when clean starting the
slurmctld.
-- Cray modulefile: avoid removing /usr/bin from path on module unload.
-- Make reoccurring reservations show up in the database.
-- Adjust related resources (cpus, tasks, gres, mem, etc.) when updating
NumNodes with scontrol.
-- Don't initialize MPI plugins for batch or extern steps.`
-- slurm.spec - do not install a slurm.conf file under /etc/ld.so.conf.d.
-- X11 forwarding - fix keepalive message generation code.
-- If heterogeneous job step is unable to acquire MPI reserved ports then
avoid referencing NULL pointer. Retry assigning ports ONLY for
non-heterogeneous job steps.
-- If any acct_gather_*_init fails fatal instead of error and keep going.
-- launch/slurm plugin - Avoid using global variable for heterogeneous job
steps, which could corrupt memory.
* Changes in Slurm 17.11.0rc2
==============================
-- Prevent slurmctld abort with NodeFeatures=knl_cray and non-KNL nodes lacking
any configured features.
-- The --cpu_bind and --mem_bind options have been renamed to --cpu-bind
and --mem-bind for consistency with the rest of Slurm's options. Both
old and new syntaxes are supported for now.
-- Add slurmdb_connection_commit to the slurmdb api to commit when needed.
-- Add the federation api's to the slurmdb.h file.
-- Add job functions to the db_api.
-- Fix sacct to always use the db_api instead of sometimes calling functions
directly.
-- Fix sacctmgr to always use the db_api instead of sometimes calling functions
directly.
-- Fix sreport to always use the db_api instead of sometimes calling functions
directly.
-- Make global uid to the db_api to minimize calls to getuid().
-- Add support for HWLOC version 2.0.
-- Added more validation logic for updates to node features.
-- Added node_features_p_node_update_valid() function to node_features plugin.
-- If a job is held due to bad constraints and a node's features change then
test the job again to see if can run with the new features.
-- Added node_features_p_changible_feature() function to node_features plugin.
-- Avoid rebooting a node if a job's requested feature is not under the control
of the node_features plugin and is not currently active.
-- node_features/knl_generic plugin: Do not clear a node's non-KNL features
specified in slurm.conf.
-- Added SchedulerParameters configuration option "disable_hetero_steps" to
disable job steps that span multiple components of a heterogeneous job.
Disabled by default except with mpi/none plugin. This limitation to be
removed in Slurm version 18.08.
* Changes in Slurm 17.11.0rc1
==============================
-- Added the following jobcomp/script environment variables: CLUSTER,
DEPENDENCY, DERIVED_EC, EXITCODE, GROUPNAME, QOS, RESERVATION, USERNAME.
The format of LIMIT (job time limit) has been modified to D-HH:MM:SS.
-- Fix QOS usage factor applying to individual TRES run minute usage.
-- Print numbers using exponential format if required to fit in allocated
field width. The sacctmgr and sshare commands are impacted.
-- Make it so a backup DBD doesn't attempt to create database tables and
relies on the primary to do so.
-- By default have Slurm dynamically link to libslurm.so instead of static
linking. If static linking is desired configure with
--without-shared-libslurm.
-- Change --workdir in sbatch to be --chdir as in all other commands (salloc,
srun).
-- Add WorkDir to the job record in the database.
-- Make the UsageFactor of a QOS work when a qos has the nodecay flag.
-- Add MaxQueryTimeRange option to slurmdbd.conf to limit accounting query
ranges when fetching job records.
-- Add LaunchParameters=batch_step_set_cpu_freq to allow the setting of the cpu
frequency on the batch step.
-- CRAY - Fix statically linked applications to CRAY's PMI.
-- Fix - Raise an error back to the user when trying to update currently
unsupported core-based reservations.
-- Do not print TmpDisk space as part of 'slurmd -C' line.
-- Fix to test MaxMemPerCPU/Node partition limits when scheduling, previously
only checked on submit.
-- Work for heterogeneous job support (complete solution in v17.11):
* Set SLURM_PROCID environment variable to reflect global task rank (needed
by MPI).
* Set SLURM_NTASKS environment variable to reflect global task count (needed
by MPI).
* In srun, if only some steps are allocated and one step allocation fails,
then delete all allocated steps.
* Get SPANK plungins working with heterogeneous jobs. The
spank_init_post_opt() function is executed once per job component.
* Modify sbcast command and srun's --bcast option to support heterogeneous
jobs.
* Set more environment variables for MPI: SLURM_GTIDS and SLURM_NODEID.
* Prevent a heterogeneous job allocation from including the same nodes in
multiple components (required by MPI jobs spanning components).
* Modify step create logic so that call components of a heterogeneous job
launched by a single srun command have the same step ID value.
-- Modify output of "--mpi=list" to avoid duplicates for version numbers in
mpi/pmix plugin names.
-- Allow nodes to be rebooted while in a maintenance reservation.
-- Show nodes as down even when nodes are in a maintenance reservation.
-- Harden the slurmctld HA stack to mitigate certain split-brain issues.
-- Work for heterogeneous job support (complete solution in v17.11):
* Add burst buffer support.
* Remove srun's --mpi-combine option (always combined).
* Add SchedulerParameters configuration option "enable_hetero_steps" to
enable job steps that span multiple components of a heterogeneous job.
Disabled by default as most MPI implementations and Slurm configurations
are not currently supported. Limitation to be removed in Slurm version
18.08.
* Synchronize application launch across multiple components with debugger.
* Modify slurm_kill_job_step() to cancel all components of a heterogeneous
job step (used by MPI).
* Set SLURM_JOB_NUM_NODES environment variable as needed by MVAPICH.
* Base time limit upon the time that the latest job component is available
(after all nodes in all components booted and ready for use).
-- Add cluster name to smail tool email header.
-- Speedup arbitrary distribution algorithm.
-- Modify "srun --mpi=list" output to match valid option input by removing the
"mpi/" prefix on each line of output.
-- Automatically set the reservation's partition for the job if not the
cluster default.
-- mpi/pmi2 plugin - vestigial pointer could be referenced at shutdown with
invalid memory reference resulting.
-- Fix to _is_gres_cnt_zero() return false for improper input string
-- Cleanup all pthread_create calls and replace with new slurm_thread_create
macro.
-- Removed obsolete MPI plugins. Remaining options are openmpi, pmi2, pmix.
-- Removed obsolete checkpoint/poe plugin.
-- Process spank environment variable options before processing spank command
line options. Spank plugins should be able to handle option callbacks being
called multiple times.
-- Add support for specialized cores with task/affinity plugin (previously
only supported with task/cgroup plugin).
-- Add "TaskPluginParam=SlurmdOffSpec" option that will prevent the Slurm
compute node daemons (slurmd and slurmstepd) from executing on specialized
cores.
-- CRAY - Make native mode default, use --disable-native-cray to use ALPS
instead of native Slurm.
-- Add ability to prevent suspension of some count of nodes in a specified
range using the SuspendExcNodes configuration parameter.
-- Add SLURM_WCKEY to PrologSlurmctld and EpilogSlurmctld environment.
-- Return user response string in response to successful job allocation request
not only on failure. Set in LUA using function 'slurm.user_msg("STRING")'.
-- Add 'scontrol write batch_script <jobid>' command to retrieve the batch
script for a given job.
-- Remove option to display the batch script as part of 'scontrol show job'.
-- On native Cray system the configured RebootProgram is executed on on the
head node by the slurmctld daemon rather than by the slurmd daemons on the
compute nodes. The "capmc_resume" program from "contribs/cray" can be used.
-- Modify "scontrol top" command to accept a comma separated list of job IDs
as an argument rather than a single job ID.
-- Add MemorySwappiness value to cgroup.conf.
-- Add new "billing" TRES which allows jobs to be limited based on the job's
billable TRES calculated by the job's partition's TRESBillingWeights.
-- sbatch - force line-buffered output so 'sbatch -W' returns the jobid
over a piped output immediately.
-- Regular user use of "scontrol top" command is now diabled. Use the
configuration parameter "SchedulerParameters=enable_user_top" to enable
that functionality. The configuration parameter
"SchedulerParameters=disable_user_top" will be silently ignored.
-- Add -TALL to sreport.
-- Removed unused SlurmdPlugstack option and associated framework.
-- Correct logic for line continuation in srun --multi-prog file.
-- Add DBD Agent queue size to sdiag output.
-- Add running job count to sdiag output.
-- Print unix timestamps next to ASCII timestamps in sdiag output.
-- In a job allocation spanning KNL and non-KNL nodes and requiring a reboot,
do not attempt to set default NUMA or MCDRAM modes on non-KNL nodes.
-- Change default to let pending jobs run outside of reservation after
reservation is gone to put jobs in held state. Added NO_HOLD_JOBS_AFTER_END
reservation flag to use old default.
-- When creating a reservation, validate the CoreCnt specification matches
the number of nodes listed.
-- When creating a reservation, correct logic to ignoring job allocations on
request.
-- Deprecate BLCR plugin, and do not build by default.
-- Change sreport report titles from "Use" to "Usage"
* Changes in Slurm 17.11.0pre2
==============================
-- Initial work for heterogeneous job support (complete solution in v17.11):
* Modified salloc, sbatch and srun commands to parse command line, job
script and environment variables to recognize requests for heterogeneous
jobs. Same commands also modified to set environment variables describing
each component of the heterogeneous job.
* Modified job allocate, batch job submit and job "will-run" requests to
pass a list of job specifications and get a list of responses.
* Modify slurmctld daemon to process a heterogeneous job request and create
multiple job records as needed.
* Added new fields to job record: pack_job_id, pack_job_offset and
pack_job_set (set of job IDs). Added to slurmctld state save/restore
logic and job information reported.
* Display new job fields in "scontrol show job" output.
* Modify squeue command to display heterogeneous job records using "#+#"
format. The squeue --job=# output lists all components of a heterogeneous
job.
* Modify scancel logic to cancel all components of a heterogeneous job with
a single request/RPC.
* Configuration parameter DebugFlags value of "HeteroJobs" added.
* Job requeue and suspend/resume modified to operate on all components of
a heterogeneous job with a single request/RPC.
* New web page added to describe heterogeneous jobs.
* Descriptions of new API added to man pages.
* Modified email notifications to only operate on the first job component.
* Purge heterogeneous job records at the same time and not by individual
components.
* Modified logic for heterogeneous jobs submitted to multiple clusters
("--clusters=...") so the job will be routed to the cluster that is
expected to start all components earliest.
* Modified srun to create multiple job steps for heterogeneous job
allocations.
* Modified launch plugin to accept a pointer to job step options structure
rather than work from a single/common data structure.
-- Improve backfill scheduling algorithm with respect to starting jobs as soon
as possible while avoiding advanced reservations.
-- Add URG as an option to 'scancel --signal'.
-- Check if the buffer returned from slurm_persist_msg_pack() isn't NULL.
-- Modify all daemons to re-open log files on receipt of SIGUSR2 signal. This
is much than using SIGHUP to re-read the configuration file and rebuild
various tables.
-- Add PrivateData=events configuration parameter
-- Work for heterogeneous job support (complete solution in v17.11):
* Add pointer to job option structure to job_step_create_allocation()
function used by srun.
* Parallelize task launch for heterogeneous job allocations (initial work).
* Make packjobid, packjoboffset, and packjobidset fields available in squeue
output.
* Modify smap command to display heterogeneous job records using "#+#"
format.
* Add srun --pack-group and --mpi-combine options to control job step
launch behaviour (not fully implemented).
* Add pack job component ID to srun --label output (e.g. "P0 1:" for
job component 0 and task 1).
* jobcomp/elasticsearch: Add pack_job_id and pack_job_offset fields.
* sview: Modified to display pack job information.
* Major re-write of task state container logic to support for list of
containers rather than one container per srun command.
* Add some regression tests.
* Add srun pack job environment variables when performing job allocation.
-- Set Reason=dependency over Reason=JobArrayTaskLimit for pending jobs.
-- Add slurm.conf configuration parameters SlurmctldSyslogDebug and
SlurmdSyslogDebug to control which messages from the slurmctld and slurmd
daemons get written to syslog.
-- Add slurmdbd.conf configuration parameter DebugLevelSyslog to control which
messages from the slurmdbd daemon get written to syslog.
-- Fix handling of GroupUpdateForce option.
-- Work for heterogeneous job support (complete solution in v17.11):
* Add support to sched/backfill for concurrent allocation of all pack job
components including support of --time-min option.
* Defer initiation of a heterogeneous job until a components can be started
at the same time, taking into consideration association and QOS limits
for the job as a whole.
* Perform limit check on heterogeneous job as a whole at submit time to
reject jobs that will never be able to run.
* Add pack_job_id and pack_job_offset to accounting database.
* Modified sacct to accept pack job ID specification using "#+#" notation.
* Modified sstat to accept pack job ID specification using "#+#" notation.
-- Clear a job's "wait reason" value of BeginTime" after that time has passed.
Previously a readon of "BeginTime" could be reported long after the job's
requested begin time had passed.
-- Split group_info in slurm_ctl_conf_t into group_force and group_time.
-- Work for heterogeneous job support (complete solution in v17.11):
* Fix I/O race condition on step termination for srun launching multiple
pack job groups.
* If prolog is running when attempting to signal a step, then return EAGAIN
and retry rather than simply returning SLURM_ERROR and aborting.
* Modify launch/slurm plugin to signal all components of a pack job rather
than just the one (modify to use a list of step context records).
* Add logic to support srun --mpi-combine option.
* Set up debugger data structures.
* Disable cancellation of individual component while the job is pending.
* Modify scontrol job hold/release and update to operate with heterogeneous
job id specification (e.g. "scontrol hold 123+4").
* If srun lacks application specification for some component, the next one
specified will be used for earlier components.
* Changes in Slurm 17.11.0pre1
==============================
-- Interpet all format options in output/error file to log prolog errors. Prior
logic only supported "%j" (job ID) option.
-- Add the configure option --with-shared-libslurm which will link to
libslurm.so instead of libslurm.o thus reducing the footprint of all the
binaries.
-- In switch plugin, added plugin_id symbol to plugins and wrapped
switch_jobinfo_t with dynamic_plugin_data_t in interface calls in
order to pass switch information between clusters with different switch
types.
-- Switch naming of acct_gather_infiniband to acct_gather_interconnect
-- Make it so you can "stack" the interconnect plugins.
-- Add a last_sched_eval timestamp to record when a job was last evaluated
by the main scheduler or backfill.
-- Add scancel "--hurry" option to avoid staging out any burst buffer data.
-- Simplify the sched plugin interface.
-- Add new advanced reservation flags of "weekday" (repeat on each weekday;
Monday through Friday) and "weekend" (repeat on each weekend day; Saturday
and Sunday).
-- Add new advanced reservation flag of "flex", which permits jobs requesting
the reservation to begin prior to the reservation's start time and use
resources inside or outside of the reservation. A typical use case is to
prevent jobs not explicitly requesting the reservation from using those
reserved resources rather than forcing jobs requesting the reservation to
use those resources in the time frame reserved.
-- Add NoDecay flag to QOS.
-- Node "OS" field expanded from "sysname" to "sysname release version" (e.g.
change from "Linux" to
"Linux 4.8.0-28-generic #28-Ubuntu SMP Sat Feb 8 09:15:00 UTC 2017").
-- jobcomp/elasticsearch - Add "job_name" and "wc_key" fields to stored
information.
-- jobcomp/filetxt - Add ArrayJobId, ArrayTaskId, ReservationName, Gres,
Account, QOS, WcKey, Cluster, SubmitTime, EligibleTime, DerivedExitCode and
ExitCode.
-- scontrol modified to report core IDs for reservation containing individual
cores.
-- MYSQL - Get rid of table join during rollup which speeds up the process
dramatically on large job/step tables.
-- Add ability to define features on clusters for directing federated jobs to
different clusters.
-- Add new RPC to process multiple federation RPCs in a single communication.
-- Modify slurm_load_jobs() function to load job information from all clusters
in a federation.
-- Add squeue --local and --sibling options to modify filtering of jobs on
federated clusters.
-- Add SchedulerParameters option of bf_max_job_user_part to specifiy the
maximum number of jobs per user for any single partition. This differs from
bf_max_job_user in that a separate counter is applied to each partition
rather than having a single counter per user applied to all partitions.
-- Modify backfill logic so that bf_max_job_user, bf_max_job_part and
bf_max_job_user_part options can all be used independently of each other.
-- Add sprio -p/--partition option to filter jobs by partition name.
-- Add partition name to job priority factor response message.
-- Add sprio --local and --sibling options for use in federation of clusters.
-- Add sprio "%c" format to print cluster name in federation mode.
-- Modify sinfo logic to provided unified view of all nodes and partitions
in a federation, add --local option to only report local state information
even in a cluster, print cluster name with "%V" format option, and
optionally sort by cluster name.
-- If a task in a parallel job fails and it was launched with the
--kill-on-bad-exit option then terminate the remaining tasks using the
SIGCONT, SIGTERM and SIGKILL signals rather than just sending SIGKILL.
-- Include submit_time when doing the sort for job scheduling.
-- Modify sacct to report all jobs in federation by default. Also add --local
option.
-- Modify sacct to accept "--cluster all" option (in addition to the old
"--cluster -1", which is still accepted).
-- Modify sreport to report all jobs in federation by default. Also add --local
option.
-- sched/backfill: Improve assoc_limit_stop configuration parameter support.
-- KNL features: Always keep active and available features in the same order:
first site-specific features, next MCDRAM modes, last NUMA modes.
-- Changed default ProctrackType to cgroup.
-- Add "cluster_name" field to node_info_t and partition_info_t data structure.
It is filled in only when the cluster is part of a federation and
SHOW_FEDERATION flag used.
-- Functions slurm_load_node() slurm_load_partitions() modified to show all
nodes/partitions in a federation when the SHOW_FEDERATION flag is used.
-- Add federated views to sview.
-- Add --federation option to sacct, scontrol, sinfo, sprio, squeue, sreport to
show a federated view. Will show local view by default.
-- Add FederationParameters=fed_display slurm.conf option to configure status
commands to display a federated view by default if the cluster is a member
of a federation.
-- Log the down nodes whenever slurmctld restarts.
-- Report that "CPUs" plus "Boards" in node configuration invalid only if the
CPUs value is not equal to the total thread count.
-- Extend the output of the seff utility to also include the job's wall-clock
time.
-- Add bf_max_time to SchedulerParameters.
-- Add bf_max_job_assoc to SchedulerParameters.
-- Add new SchedulerParameters option bf_window_linear to control the rate at
which the backfill test window expands. This can be used on a system with
a modest number of running jobs (hundreds of jobs) to help prevent expected
start times of pending jobs to get pushed forward in time. On systems with
large numbers of running jobs, performance of the backfill scheduler will
suffer and fewer jobs will be evaluated.
-- Improve scheduling logic with respect to license use and node reboots.
-- CRAY - Alter algorithm to come up with the SLURM_ID_HASH.
-- Implement federated scheduling and federated status outputs.
-- The '-q' option to srun has changed from being the short form of
'--quit-on-interrupt' to '--qos'.
-- Change sched_min_interval default from 0 to 2 microseconds.
* Changes in Slurm 17.02.11
===========================
-- Fix insecure handling of user_name and gid fields. CVE-2018-10995.
* Changes in Slurm 17.02.10
===========================
-- Fix updating of requested TRES memory.
-- Cray modulefile: avoid removing /usr/bin from path on module unload.
-- Fix issue when resetting the partition pointers on nodes.
-- Show reason field in 'sinfo -R' when nodes is marked as failed.
-- Fix potential of slurmstepd segfaulting when the extern step fails to start.
-- Allow nodes state to be updated between FAIL and DRAIN.
-- Avoid registering a job'd credential multiple times.
-- Fix sbatch --wait to stop waiting after job is gone from memory.
-- Fix memory leak of MailDomain configuration string when slurmctld daemon is
reconfigured.
-- Fix to properly remove extern steps from the starting_steps list.
-- Fix Slurm to work correctly with HDF5 1.10+.
-- Add support in salloc/srun --bb option for "access_mode" in addition to
"access" for consistency with DW options.
-- Fix potential deadlock in _run_prog() in power save code.
-- MYSQL - Add dynamic_offset in the database to force range for auto
increment ids for the tres_table.
-- Avoid setting node in COMPLETING state indefinitely if the job initiating
the node reboot is cancelled while the reboot in in progress.
-- node_feature/knl_cray - Fix memory leaks that occur when slurmctld
reconfigured.
-- node_feature/knl_cray - Fix memory leak that can occur during normal
operation.
-- Fix job array dependency with "aftercorr" option and some task arrays in
the first job fail. This fix lets all task array elements that can run
proceed rather than stopping all subsequent task array elements.
-- Fix whole node allocation cpu counts when --hint=nomultihtread.
-- NRT - Fix issue when running on a HFI (p775) system with multiple protocols.
-- Fix uninitialized variables when unpacking slurmdb_archive_cond_t.
-- Fix security issue in accounting_storage/mysql plugin by always escaping
strings within the slurmdbd. CVE-2018-7033.
* Changes in Slurm 17.02.9
==========================
-- When resuming powered down nodes, mark DOWN nodes right after ResumeTimeout
has been reached (previous logic would wait about one minute longer).
-- Fix sreport not showing full column name for TRES Count.
-- Fix slurmdb_reservations_get() giving wrong usage data when job's spanned
reservation that was modified.
-- Fix sreport reservation utilization report showing bad data.
-- Show all TRES' on a reservation in sreport reservation utilization report by
default.
-- Fix sacctmgr show reservation handling "end" parameter.
-- Work around issue with sysmacros.h and gcc7 / glibc 2.25.
-- Fix layouts code to only allow setting a boolean.
-- Fix sbatch --wait to keep waiting even if a message timeout occurs.
-- CRAY - If configured with NodeFeatures=knl_cray and there are non-KNL
nodes which include no features the slurmctld will abort without
this patch when attemping strtok_r(NULL).
-- Fix regression in 17.02.7 which would run the spank_task_privileged as
part of the slurmstepd instead of it's child process.
-- Fix security issue in Prolog and Epilog by always prepending SPANK_ to
all user-set environment variables. CVE-2017-15566.
* Changes in Slurm 17.02.8
==========================
-- Add 'slurmdbd:' to the accounting plugin to notify message is from dbd
instead of local.
-- mpi/mvapich - Buffer being only partially cleared. No failures observed.
-- Fix for job --switch option on dragonfly network.
-- In salloc with --uid option, drop supplementary groups before changing UID.
-- jobcomp/elasticsearch - strip any trailing slashes from JobCompLoc.
-- jobcomp/elasticsearch - fix memory leak when transferring generated buffer.
-- Prevent slurmstepd ABRT when parsing gres.conf CPUs.
-- Fix sbatch --signal to signal all MPI ranks in a step instead of just those
on node 0.
-- Check multiple partition limits when scheduling a job that were previously
only checked on submit.
-- Cray: Avoid running application/step Node Health Check on the external
job step.
-- Optimization enhancements for partition based job preemption.
-- Address some build warnings from GCC 7.1, and one possible memory leak if
/proc is inaccessible.
-- If creating/altering a core based reservation with scontrol/sview on a
remote cluster correctly determine the select type.
-- Fix autoconf test for libcurl when clang is used.
-- Fix default location for cgroup_allowed_devices_file.conf to use correct
default path.
-- Document NewName option to sacctmgr.
-- Reject a second PMI2_Init call within a single step to prevent slurmstepd
from hanging.
-- Handle old 32bit values stored in the database for requested memory
correctly in sacct.
-- Fix memory leaks in the task/cgroup plugin when constraining devices.
-- Make extremely verbose info messages debug2 messages in the task/cgroup
plugin when constraining devices.
-- Fix issue that would deny the stepd access to /dev/null where GRES has a
'type' but no file defined.
-- Fix issue where the slurmstepd would fatal on job launch if you have no
gres listed in your slurm.conf but some in gres.conf.
-- Fix validating time spec to correctly validate various time formats.
-- Make scontrol work correctly with job update timelimit [+|-]=.
-- Reduce the visibily of a number of warnings in _part_access_check.
-- Prevent segfault in sacctmgr if no association name is specified for
an update command.
-- burst_buffer/cray plugin modified to work with changes in Cray UP05
software release.
-- Fix job reasons for jobs that are violating assoc MaxTRESPerNode limits.
-- Fix segfault when unpacking a 16.05 slurm_cred in a 17.02 daemon.
-- Fix setting TRES limits with case insensitive TRES names.
-- Add alias for xstrncmp() -- slurm_xstrncmp().
-- Fix sorting of case insensitive strings when using xstrcasecmp().
-- Gracefully handle race condition when reading /proc as process exits.
-- Avoid error on Cray duplicate setup of core specialization.
-- Skip over undefined (hidden in Slurm) nodes in pbsnodes.
-- Add empty hashes in perl api's slurm_load_node() for hidden nodes.
-- CRAY - Add rpath logic to work for the alpscomm libs.
-- Fixes for administrator extended TimeLimit (job reason & time limit reset).
-- Fix gres selection on systems running select/linear.
-- sview: Added window decorator for maximize,minimize,close buttons for all
systems.
-- squeue: interpret negative length format specifiers as a request to
delimit values with spaces.
-- Fix the torque pbsnodes wrapper script to parse a gres field with a type
set correctly.
* Changes in Slurm 17.02.7
==========================
-- Fix deadlock if requesting to create more than 10000 reservations.
-- Fix potential memory leak when creating partition name.
-- Execute the HealthCheckProgram once when the slurmd daemon starts rather
than executing repeatedly until an exit code of 0 is returned.
-- Set job/step start and end times to 0 when using --truncate and start > end.
-- Make srun --pty option ignore EINTR allowing windows to resize.
-- When resuming node only send one message to the slurmdbd.
-- Modify srun --pty option to use configured SrunPortRange range.
-- Fix issue with whole gres not being printed out with Slurm tools.
-- Fix issue with multiple jobs from an array are prevented from starting.
-- Fix for possible slurmctld abort with use of salloc/sbatch/srun
--gres-flags=enforce-binding option.
-- Fix race condition when using jobacct_gather/cgroup where the memory of the
step wasn't always gathered correctly.
-- Better debug when slurmdbd queue is filling up in the slurmctld.
-- Fixed truncation on scontrol show config output.
-- Serialize updates from from the dbd to the slurmctld.
-- Fix memory leak in slurmctld when agent queue to the DBD has filled up.
-- CRAY - Throttle step creation if trying to create too many steps at once.
-- If failing after switch_g_job_init happened make sure switch_g_job_fini is
called.
-- Fix minor memory leak if launch fails in the slurmstepd.
-- Fix issue where UnkillableStepProgram if step was in an ending state.
-- Fix bug when tracking multiple simultaneous spawned ping cycles.
-- jobcomp/elasticsearch plugin now saves state of pending requests on
slurmctld daemon shutdown so then can be recovered on restart.
-- Fix issue when an alternate munge key when communicating on a persistent
connection.
-- Document inconsistent behavior of GroupUpdateForce option.
-- Fix bug in selection of GRES bound to specific CPUs where the GRES count
is 2 or more. Previous logic could allocate CPUs not available to the job.
-- Increase buffer to handle long /proc/<pid>/stat output so that Slurm can
read correct RSS value and take action on jobs using more memory than
requested.
-- Fix srun job jobs that can run immediately to run in the highest priority
partion when multiple partitions are listed. scontrol show jobs can
potentially show the partition list in priority order.
-- Fix starting controller if StateSaveLocation path didn't exist.
-- Fix inherited association 'max' TRES limits combining multiple limits in
the tree.
-- Sort TRES id's on limits when getting them from the database.
-- Fix issue with pmi[2|x] when TreeWidth=1.
-- Correct buffer size used in determining specialized cores to avoid possible
truncation of core specification and not reserving the specified cores.
-- Close race condition on Slurm structures when setting DebugFlags.
-- Make it so the cray/switch plugin grabs new DebugFlags on a reconfigure.
-- Fix incorrect lock levels when creating or updating a reservation.
-- Fix overlapping reservation resize.
-- Add logic to help support Dell KNL systems where syscfg is different than
the normal Intel syscfg.
-- CRAY - Fix BB to handle type= correctly, regression in 17.02.6.
* Changes in Slurm 17.02.6
==========================
-- Fix configurator.easy.html to output the SelectTypeParameters line.
-- If a job requests a specific memory requirement then gets something else
from the slurmctld make sure the step allocation is made aware of it.
-- Fix missing initialization in slurmd.
-- Fix potential degradation when running HTC (> 100 jobs a sec) like
workflows through the slurmd.
-- Fix race condition which could leave a stepd hung on shutdown.
-- CRAY - Add configuration for ATP to the ansible play script.
-- Fix potential to corrupt DBD message.
-- burst_buffer logic modified to support sizes in both SI and EIC size units
(e.g. M/MiB for powers of 1024, MB for powers of 1000).
* Changes in Slurm 17.02.5
==========================
-- Prevent segfault if a job was blocked from running by a QOS that is then
deleted.
-- Improve selection of jobs to preempt when there are multiple partitions
with jobs subject to preemption.
-- Only set kmem limit when ConstrainKmemSpace=yes is set in cgroup.conf.
-- Fix bug in task/affinity that could result in slurmd fatal error.
-- Increase number of jobs that are tracked in the slurmd as finishing at one
time.
-- Note when a job finishes in the slurmd to avoid a race when launching a
batch job takes longer than it takes to finish.
-- Improve slurmd startup on large systems (> 10000 nodes)
-- Add LaunchParameters option of cray_net_exclusive to control whether all
jobs on the cluster have exclusive access to their assigned nodes.
-- Make sure srun inside an allocation gets --ntasks-per-[core|socket]
set correctly.
-- Only make the extern step at job creation.
-- Fix for job step task layout with --cpus-per-task option.
-- Fix --ntasks-per-core option/environment variable parsing to set
the requested value, instead of always setting one (srun).
-- Correct error message when ClusterName in configuration files does not match
the name in the slurmctld daemon's state save file.
-- Better checking when a job is finishing to avoid underflow on job's
submitted to a QOS/association.
-- Handle partition QOS submit limits correctly when a job is submitted to
more than 1 partition or when the partition is changed with scontrol.
-- Performance boost for when Slurm is dealing with credentials.
-- Fix race condition which could leave a stepd hung on shutdown.
-- Add lua support for opensuse.
* Changes in Slurm 17.02.4
==========================
-- Do not attempt to schedule jobs after changing the power cap if there are
already many active threads.
-- Job expansion example in FAQ enhanced to demonstrate operation in
heterogeneous environments.
-- Prevent scontrol crash when operating on array and no-array jobs at once.
-- knl_cray plugin: Log incomplete capmc output for a node.
-- knl_cray plugin: Change capmc parsing of mcdram_pct from string to number.
-- Remove log files from test20.12.
-- When rebooting a node and using the PrologFlags=alloc make sure the
prolog is ran after the reboot.
-- node_features/knl_generic - If a node is rebooted for a pending job, but
fails to enter the desired NUMA and/or MCDRAM mode then drain the node and
requeue the job.
-- node_features/knl_generic disable mode change unless RebootProgram
configured.
-- Add new burst_buffer function bb_g_job_revoke_alloc() to be executed
if there was a failure after the initial resource allocation. Does not
release previously allocated resources.
-- Test if the node_bitmap on a job is NULL when testing if the job's nodes
are ready. This will be NULL is a job was revoked while beginning.
-- Fix incorrect lock levels when testing when job will run or updating a job.
-- Add missing locks to job_submit/pbs plugin when updating a jobs
dependencies.
-- Add support for lua5.3
-- Add min_memory_per_node|cpu to the job_submit/lua plugin to deal with lua
not being able to deal with pn_min_memory being a uint64_t. Scripts are
urged to change to these new variables avoid issue. If not set the
variables will be 'nil'.
-- Calculate priority correctly when 'nice' is given.
-- Fix minor typos in the documentation.
-- node_features/knl_cray: Preserve non-KNL active features if slurmctld
reconfigured while node boot in progress.
-- node_features/knl_generic: Do not repeatedly log errors when trying to read
KNL modes if not KNL system.
-- Add missing QOS read lock to backfill scheduler.
-- When doing a dlopen on liblua only attempt the version compiled against.
-- Fix null-dereference in sreport cluster ulitization when configured with
memory-leak-debug.
-- Fix Partition info in 'scontrol show node'. Previously duplicate partition
names, or Partitions the node did not belong to could be displayed.
-- Fix it so the backup slurmdbd will take control correctly.
-- Fix unsafe use of MAX() macro, which could result in problems cleaning up
accounting plugins in slurmd, or repeat job cancellation attempts in
scancel.
-- Fix 'scontrol update reservation duration=unlimited' to set the duration
to 365-days (as is done elsewhere), rather than 49710 days.
-- Check if variable given to scontrol show job is a valid jobid.
-- Fix WithSubAccounts option to not include WithDeleted unless requested.
-- Prevent a job tested on multiple partitions from being marked
WHOLE_NODE_USER.
-- Prevent a race between completing jobs on a user-exclusive node from
leaving the node owned.
-- When scheduling take the nodes in completing jobs out of the mix to reduce
fragmentation. SchedulerParameters=reduce_completing_frag
-- For jobs submited to multiple partitions, report the job's earliest start
time for any partition.
-- Backfill partitions that use QOS Grp limits to "float" better.
-- node_features/knl_cray: don't clear configured GRES from non-KNL node.
-- sacctmgr - prevent segfault in command when a request is denied due
to a insufficient priviledges.
-- Add warning about libcurl-devel not being installed during configure.
-- Streamline job purge by handling file deletion on a separate thread.
-- Always set RLIMIT_CORE to the maximum permitted for slurmd, to ensure
core files are created even on non-developer builds.
-- Fix --ntasks-per-core option/environment variable parsing to set
the requested value, instead of always setting one.
-- If trying to cancel a step that hasn't started yet for some reason return
a good return code.
-- Fix issue with sacctmgr show where user=''
* Changes in Slurm 17.02.3
==========================
-- Increase --cpu_bind and --mem_bind field length limits.
-- Fix segfault when using AdminComment field with job arrays.
-- Clear Dependency field when all dependencies are satisfied.
-- Add --array-unique to squeue which will display one unique pending job
array element per line.
-- Reset backfill timers correctly without skipping over them in certain
circumstances.
-- When running the "scontrol top" command, make sure that all of the user's
jobs have a priority that is lower than the selected job. Previous logic
would permit other jobs with equal priority (no jobs with higher priority).
-- Fix perl api so we always get an allocation when calling Slurm::new().
-- Fix issue with cleaning up cpuset and devices cgroups when multiple steps
end at the same time.
-- Document that PriorityFlags option of DEPTH_OBLIVIOUS precludes the use of
FAIR_TREE.
-- Fix issue if an invalid message came in a Slurm daemon/command may abort.
-- Make it impossible to use CR_CPU* along with CR_ONE_TASK_PER_CORE. The
options are mutually exclusive.
-- ALPS - Fix scheduling when ALPS doesn't agree with Slurm on what nodes
are free.
-- When removing a partition make sure it isn't part of a reservation.
-- Fix seg fault if loading attempting to load non-existent burstbuffer plugin.
-- Fix to backfill scheduling with respect to QOS and association limits. Jobs
submitted to multiple partitions are most likley to be effected.
-- sched/backfill: Improve assoc_limit_stop configuration parameter support.
-- CRAY - Add ansible play and README.
-- sched/backfill: Fix bug related to advanced reservations and the need to
reboot nodes to change KNL mode.
-- Preempt plugins - fix check for 'preempt_youngest_first' option.
-- Preempt plugins - fix incorrect casts in preempt_youngest_first mode.
-- Preempt/job_prio - fix incorrect casts in sort function.
-- Fix to make task/affinity work with ldoms where there are more than 64
cpus on the node.
-- When using node_features/knl_generic make it so the slurmd doesn't segfault
when shutting down.
-- Fix potential double-xfree() when using job arrays that can lead to
slurmctld crashing.
-- Fix priority/multifactor priorities on a slurmctld restart if not using
accounting_storage/[mysql|slurmdbd].
-- Fix NULL dereference reported by CLANG.
-- Update proctrack documentation to strongly encourage use of
proctrack/cgroup.
-- Fix potential memory leak if job fails to begin after nodes have been
selected for a job.
-- Handle a job that made it out of the select plugin without a job_resrcs
pointer.
-- Fix potential race condition when persistent connections are being closed at
shutdown.
-- Fix incorrect locks levels when submitting a batch job or updating a job
in general.
-- CRAY - Move delay waiting for job cleanup to after we check once.
-- MYSQL - Fix memory leak when loading archived jobs into the database.
-- Fix potential race condition when starting the priority/multifactor plugin's
decay thread.
-- Sanity check to make sure we have started a job in acct_policy.c before we
clear it as started.
-- Allow reboot program to use arguments.
-- Message Aggr - Remove race condition on slurmd shutdown with respects to
destroying a mutex.
-- Fix updating job priority on multiple partitions to be correct.
-- Don't remove admin comment when updating a job.
-- Return error when bad separator is given for scontrol update job licenses.
* Changes in Slurm 17.02.2
==========================
-- Update hyperlink to LBNL Node Health Check program.
-- burst_buffer/cray - Add support for line continuation.
-- If a job is cancelled by the user while it's allocated nodes are being
reconfigured (i.e. the capmc_resume program is rebooting nodes for the job)
and the node reconfiguration fails (i.e. the reboot fails), then don't
requeue the job but leave it in a cancelled state.
-- capmc_resume (Cray resume node script) - Do not disable changing a node's
active features if SyscfgPath is configured in the knl.conf file.
-- Improve the srun documentation for the --resv-ports option.
-- burst_buffer/cray - Fix parsing for discontinuous allocated nodes. A job
allocation of "20,22" must be expressed as "20\n22".
-- Fix rare segfault when shutting down slurmctld and still sending data to
the database.
-- Fix gres output of a job if it is updated while pending to be displayed
correctly with Slurm tools.
-- Fix pam_slurm_adopt.
-- Fix missing unlock when job_list doesn't exist when starting priority/
multifactor.
-- Fix segfault if slurmctld is shutting down and the slurmdbd plugin was
in the middle of setting db_indexes.
-- Add ESLURM_JOB_SETTING_DB_INX to errno to note when a job can't be updated
because the dbd is setting a db_index.
-- Fix possible double insertion into database when a job is updated at the
moment the dbd is assigning a db_index.
-- Fix memory error when updating a job's licenses.
-- Fix seff to work correctly with non-standard perl installs.
-- Export missing slurmdbd_defs_[init|fini] needed for libslurmdb.so to work.
-- Fix sacct from returning way more than requested when querying against a job
array task id.
-- Fix double read lock of tres when updating gres or licenses on a job.
-- Make sure locks are always in place when calling
assoc_mgr_make_tres_str_from_array.
-- Prevent slurmctld SEGV when creating reservation with duplicated name.
-- Consider QOS flags Partition[Min|Max]Nodes when doing backfill.
-- Fix slurmdbd_defs.c to not have half symbols go to libslurm.so and the
other half go to libslurmdb.so.
-- Fix 'scontrol show jobs' to remove an errant newline when 'Switches' is
printed.
-- Better code for handling memory required by a task on a heterogeneous
system.
-- Fix regression in 17.02.0 with respects to GrpTresMins on a QOS or
Association.
-- Cleanup to make make dist work.
-- Schedule interactive jobs quicker.
-- Perl API - correct value of MEM_PER_CPU constant to correctly handle
memory values.
-- Fix 'flags' variable to be 32 bit from the old 16 bit value in the perl api.
-- Export sched_nodes for a job in the perl api.
-- Improve error output when updating a reservation that has already started.
-- Fix --ntasks-per-node issue with srun so DenyOnLimit would work correctly.
-- node_features/knl_cray plugin - Fix memory leak.
-- Fix wrong cpu_per_task count issue on heterogeneous system when dealing with
steps.
-- Fix double free issue when removing usage from an association with sacctmgr.
-- Fix issue with SPANK plugins attempting to set null values as environment
variables, which leads to the command segfaulting on newer glibc versions.
-- Fix race condition on slurmctld startup when plugins have not gone through
init() ahead of the rpc_manager processing incoming messages.
-- job_submit/lua - expose admin_comment field.
-- Allow AdminComment field to be set by the job_submit plugin.
-- Allow AdminComment field to be changed by any Administrator.
-- Fix key words in jobcomp select.
-- MYSQL - Streamline job flush sql when doing a clean start on the slurmctld.
-- Fix potential infinite loop when talking to the DBD when shutting down
the slurmctld.
-- Fix MCS filter.
-- Make it so pmix can be included in the plugin rpm without having to
specify --with-pmix.
-- MYSQL - Fix initial load when not using he DBD.
-- Fix scontrol top to not make jobs priority 0 (held).
-- Downgrade info message about exceeding partition time limit to a debug2.
* Changes in Slurm 17.02.1-2
============================
-- Replace clock_gettime with time(NULL) for very old systems without the call.
* Changes in Slurm 17.02.1
==========================
-- Modify pam module to work when configured NodeName and NodeHostname differ.
-- Update to sbatch/srun man pages to explain the "filename pattern" clearer
-- Add %x to sbatch/srun filename pattern to represent the job name.
-- job_submit/lua - Add job "bitflags" field.
-- Update slurm.spec file to note obsolete RPMs.
-- Fix deadlock scenario when dumping configuration in the slurmctld.
-- Remove unneeded job lock when running assoc_mgr cache. This lock could
cause potential deadlock when/if TRES changed in the database and the
slurmctld wasn't made aware of the change. This would be very rare.
-- Fix missing locks in gres logic to avoid potential memory race.
-- If gres is NULL on a job don't try to process it when returning detailed
information about a job to scontrol.
-- Fix print of consumed energy in sstat when no energy is being collected.
-- Print formatted tres string when creating/updating a reservation.
-- Fix issues with QOS flags Partition[Min|Max]Nodes to work correctly.
-- Prevent manipulation of the cpu frequency and governor for batch or
extern steps. This addresses an issue where the batch step would
inadvertently set the cpu frequency maximum to the minimum value
supported on the node.
-- Convert a slurmctd power management data structure from array to list in
order to eliminate the possibility of zombie child suspend/resume
processes.
-- Burst_buffer/cray - Prevent slurmctld daemon abort if "paths" operation
fails. Now job will be held. Update job update time when held.
-- Fix issues with QOS flags Partition[Min|Max]Nodes to work correctly.
-- Refactor slurmctld agent logic to eliminate some pthreads.
-- Added "SyscfgTimeout" parameter to knl.conf configuration file.
-- Fix for CPU binding for job steps run under a batch job.
* Changes in Slurm 17.02.0
==========================
-- job_submit/lua - Make "immediate" parameter available.
-- Fix srun I/O race condtion to eliminate a error message that might be
generated if the application exits with outstanding stdin.
-- Fix regression when purging/archiving jobs/events.
-- Add new job state JOB_OOM indicating Out Of Memory condition as detected
by task/cgroup plugin.
-- If QOS has been added to the system go refigure out Deny/AllowQOS on
partitions.
-- Deny job with duplicate GRES requested.
-- Fix loading super old assoc_mgr usage without segfaulting.
-- CRAY systems: Restore TaskPlugins order of task/cray before task/cgroup.
-- Task/cray: Treat missing "mems" cgroup with "debug" messages rather than
"error" messages. The file may be missing at step termination due to a
change in how cgroups are released at job/step end.
-- Fix for job constraint specification with counts, --ntasks-per-node value,
and no node count.
-- Fix ordering of step task allocation to fill in a socket before going into
another one.
-- Fix configure to not require C++
-- job_submit/lua - Remove access to slurmctld internal reservation fields of
job_pend_cnt and job_run_cnt.
-- Prevent job_time_limit enforcement from blocking other internal operations
if a large number of jobs need to be cancelled.
-- Add 'preempt_youngest_order' option to preempt/partition_prio plugin.
-- Fix controller being able to talk to a pre-released DBD.
-- Added ability to override the invoking uid for "scontrol update job"
by specifying "--uid=<uid>|-u <uid>".
-- Changed file broadcast "offset" from 32 to 64 bits in order to support files
over 2 GB.
-- slurm.spec - do not install init scripts alongside systemd service files.
* Changes in Slurm 17.02.0rc1
==============================
-- Add port info to 'sinfo' and 'scontrol show node'.
-- Fix errant definition of USE_64BIT_BITSTR which can lead to core dumps.
-- Move BatchScript to end of each job's information when using
"scontrol -dd show job" to make it more readable.
-- Add SchedulerParameters configuration parameter of "default_gbytes", which
treats numeric only (no suffix) value for memory and tmp disk space as being
in units of Gigabytes. Mostly for compatability with LSF.
-- Fix race condtion in srun/sattach logic which would prevent srun from
terminating.
-- Bitstring operations are now 64bit instead of 32bit.
-- Replace hweight() function in bitstring with faster version.
-- scancel would treat a non-numeric argument as the name of jobs to be
cancelled (a non-documented feature). Cancelling jobs by name now require
the "--jobname=" command line argument.
-- scancel modified to note that no jobs satisfy the filter options when the
--verbose option is used along with one or more job filters (e.g. "--qos=").
-- Change _pack_cred to use pack_bit_str_hex instead of pack_bit_fmt for
better scalability and performance.
-- Add BootTime configuration parameter to knl.conf file to optimize resource
allocations with respect to required node reboots.
-- Add node_features_p_boot_time() to node_features plugin to optimize
scheduling with respect to node reboots.
-- Avoid allocating resources to a job in the event that its run time plus boot
time (if needed) extent into an advanced reservation.
-- Burst_buffer/cray - Avoid stage-out operation if job never started.
-- node_features/knl_cray - Add capability to detected Uncorrectable Memory
Errors (UME) and if detected then log the event in all job and step stderr
with a message of the form:
error: *** STEP 1.2 ON tux1 UNCORRECTABLE MEMORY ERROR AT 2016-12-14T09:09:37 ***
Similar logic added to node_features/knl_generic in version 17.02.0pre4.
-- If job is allocated nodes which are powered down, then reset job start time
when the nodes are ready and do not charge the job for power up time.
-- Add the ability to purge transactions from the database.
-- Add support for requeue'ing of federated jobs (BETA).
-- Add support for interactive federated jobs (BETA).
-- Add the ability to purge rolled up usage from the database.
-- Properly set SLURM_JOB_GPUS environment variable for Prolog.
* Changes in Slurm 17.02.0pre4
==============================
-- Add support for per-partitiion OverTimeLimit configuration.
-- Add --mem_bind option of "sort" to run zonesort on KNL nodes at step start.
-- Add LaunchParameters=mem_sort option to configure running of zonesort
by default at step startup.
-- Add "FreeSpace" information for each pool to the "scontrol show burstbuffer"
output. Required changes to the burst_buffer_info_t data structure.
-- Add new node state flag of NODE_STATE_REBOOT for node reboots triggered by
"scontrol reboot" commands. Previous logic re-used NODE_STATE_MAINT flag,
which could lead to inconsistencies. Add "ASAP" option to "scontrol reboot"
command that will drain a node in order to reboot it as soon as possible,
then return it to service.
-- Allow unit conversion routine to convert 1024M to 1G.
-- switch/cray plugin - change legacy spool directory location.
-- Add new PriorityFlags option of INCR_ONLY, which prevents a job's priority
from being decremented.
-- Make it so we don't purge job start messages until after we purge step
messages. Hopefully this will reduce the number of messages lost when
filling up memory when the database/DBD is down.
-- Added SchedulingParameters option of "bf_job_part_count_reserve". Jobs below
the specified threshold will not have resources reserved for them.
-- If GRES are configured with file IDs, then "scontrol -d show node" will
not only identify the count of currently allocated GRES, but their specific
index numbers (e.g. "GresUsed=gpu:alpha:2(IDX:0,2),gpu:beta:0(IDX:N/A)").
Ditto for job information with "scontrol -d show job".
-- Add new mcs/account plugin.
-- Add "GresEnforceBind=Yes" to "scontrol show job" output if so configured.
-- Add support for SALLOC_CONSTRAINT, SBATCH_CONSTRAINT and SLURM_CONSTRAINT
environment variables to set default constraints for salloc, sbatch and
srun commands respectively.
-- Provide limited support for the MemSpecLimit configuration parameter without
the task/cgroup plugin.
-- node_features/knl_generic - Add capability to detected Uncorrectable Memory
Errors (UME) and if detected then log the event in all job and step stderr
with a message of the form:
error: *** STEP 1.2 ON tux1 UNCORRECTABLE MEMORY ERROR AT 2016-12-14T09:09:37 ***
-- Add SLURM_JOB_GID to TaskProlog environment.
-- burst_buffer/cray - Remove leading zeros from node ID lists passed to
dw_wlm_cli program.
-- Add "Partitions" field to "scontrol show node" output.
-- Remove sched/wiki and sched/wiki2 plugins and associated code.
-- Remove SchedulerRootFilter option and slurm_get_root_filter() API call.
-- Add SchedulerParameters option of spec_cores_first to select specialized
cores from the lowest rather than highest number cores and sockets.
-- Add PrologFlags option of Serial to disable concurrent launch of
Prolog and Epilog scripts.
-- Fix security issue caused by insecure file path handling triggered by the
failure of a Prolog script. To exploit this a user needs to anticipate or
cause the Prolog to fail for their job. CVE-2016-10030.
* Changes in Slurm 17.02.0pre3
==============================
-- Add srun host & PID to job step data structures.
-- Avoid creating duplicate pending step records for the same srun command.
-- Rewrite srun's logic for pending steps for better efficiency (fewer RPCs).
-- Added new SchedulerParameters options step_retry_count and step_retry_time
to control scheduling behaviour of job steps waiting for resources.
-- Optimize resource allocation logic for --spread-job job option.
-- Modify cpu_bind and mem_bind map and mask options to accept a repetition
count to better support large task count. For example:
"mask_mem:0x0f*2,0xf0*2" is equivalent to "mask_mem:0x0f,0x0f,0xf0,0xf0".
-- Add support for --mem_bind=prefer option to prefer, but not restrict memory
use to the identified NUMA node.
-- Add mechanism to constrain kernel memory allocation using cgroups. New
cgroup.conf parameters added: ConstrainKmemSpace, MaxKmemPercent, and
MinKmemSpace.
-- Correct invokation of man2html, which previously could cause FreeBSD builds
to hang.
-- MYSQL - Unconditionally remove 'ignore' clause from 'alter ignore'.
-- Modify service files to not start Slurm daemons until after Munge has been
started.
NOTE: If you are not using Munge, but are using the "service" scripts to
start Slurm daemons, then you will need to remove this check from the
etc/slurm*service scripts.
-- Do not process SALLOC_HINT, SBATCH_HINT or SLURM_HINT environment variables
if any of the following salloc, sbatch or srun command line options are
specified: -B, --cpu_bind, --hint, --ntasks-per-core, or --threads-per-core.
-- burst_buffer/cray: Accept new jobs on backup slurmctld daemon without access
to dw_wlm_cli command. No burst buffer actions will take place.
-- Do not include SLURM_JOB_DERIVED_EC, SLURM_JOB_EXIT_CODE, or
SLURM_JOB_EXIT_CODE in PrologSlurmctld environment (not available yet).
-- Cray - set task plugin to fatal() if task/cgroup is not loaded after
task/cray in the TaskPlugin settings.
-- Remove separate slurm_blcr package. If Slurm is built with BLCR support,
the files will now be part of the main Slurm packages.
-- Replace sjstat, seff and sjobexit RPM packages with a single "contribs"
package.
-- Remove long since defunct slurmdb-direct scripts.
-- Add SbcastParameters configuration option to control default file
destination directory and compression algorithm.
-- Add new SchedulerParameter (max_array_tasks) to limit the maximum number of
tasks in a job array independently from the maximum task ID (MaxArraySize).
-- Fix issue where number of nodes is not properly allocated when sbatch and
salloc are requested with -n tasks < hosts from -w hostlist or from -N.
-- Add infrastructure for submitting federated jobs.
* Changes in Slurm 17.02.0pre2
==============================
-- Add new RPC (REQUEST_EVENT_LOG) so that slurmd and slurmstepd can log events
through the slurmctld daemon.
-- Remove sbatch --bb option. That option was never supported.
-- Automatically clean up task/cgroup cpuset and devices cgroups after steps
are completed.
-- Add federation read/write locks.
-- Limit job purge run time to 1 second at a time.
-- The database index for jobs is now 64 bits. If you happen to be close to
4 billion jobs in your database you will want to update your slurmctld at
the same time as your slurmdbd to prevent roll over of this variable as
it is 32 bit previous versions of Slurm.
-- Optionally lock slurmstepd in memory for performance reasons and to avoid
possible SIGBUS if the daemon is paged out at the time of a Slurm upgrade
(changing plugins). Controlled via new LaunchParameters options of
slurmstepd_memlock and slurmstepd_memlock_all.
-- Add event trigger on burst buffer errors (see strigger man page,
--burst_buffer option).
-- Add job AdminComment field which can only be set by a Slurm administrator.
-- Add salloc, sbatch and srun option of --delay-boot=<time>, which will
temporarily delay booting nodes into the desired state for a job in the
hope of using nodes already in the proper state which will be available at
a later time.
-- Add job burst_buffer_state and delay_boot fields to scontrol and squeue
output. Also add ability to modify delay_boot from scontrol.
-- Fix for node's available TRES array getting filled in with configured GRES
model types.
-- Log if job --bb option contains any unrecognized content.
-- Display configured and allocated TRES for nodes in scontrol show nodes.
-- Change all memory values (in MB) to uint64_t to accommodate > 2TB per node.
-- Add MailDomain configuration parameter to qualify email addresses.
-- Refactor the persistent connections within the federation code to use
the same logic that was found in the slurmdbd. Now both functionalities
share the same code.
-- Remove BlueGene/L and BlueGene/P support.
-- Add "flag" field to launch_tasks_request_msg. Remove the following fields
(moved into flags): multi_prog, task_flags, user_managed_io, pty,
buffered_stdio, and labelio.
-- Add protocol version to slurmd startup communications for slurmstepd to
permit changes in the protocol.
* Changes in Slurm 17.02.0pre1
==============================
-- burst_buffer/cray - Add support for rounding up the size of a buffer reqeust
if the DataWarp configuration "equalize_fragments" is used.
-- Remove AIX support.
-- Rename "in" to "input" in slurm_step_io_fds data structure defined in
slurm.h. This is needed to avoid breaking Python with by using one of its
keywords in a Slurm data structure.
-- Remove eligible_time from jobcomp/elasticsearch.
-- Enable the deletion of a QOS, even if no clusters have been added to the
database.
-- SlurmDBD - change all timestamps to bigint from int to solve Y2038 problem.
-- Add salloc/sbatch/srun --spread-job option to distribute tasks over as many
nodes as possible. This also treats the --ntasks-per-node option as a
maximum value.
-- Add ConstrainKmemSpace to cgroup.conf, defaulting to yes, to allow
cgroup Kmem enforcement to be disabled while still using ConstrainRAMSpace.
-- Add support for sbatch --bbf option to specify a burst buffer input file.
-- Added burst buffer support for job arrays. Add new SchedulerParameters
configuration parameter of bb_array_stage_cnt=# to indicate how many pending
tasks of a job array should be made available for burst buffer resource
allocation.
-- Fix small memory leak when a job fails to load from state save.
-- Fix invalid read when attempting to delete clusters from database with
running jobs.
-- Fix small memory leak when deleting clusters from database.
-- Add SLURM_ARRAY_TASK_COUNT environment variable. Total number of tasks in a
job array (e.g. "--array=2,4,8" will set SLURM_ARRAY_TASK_COUNT=3).
-- Add new sacctmgr commands: "shutdown" (shutdown the server), "list stats"
(get server statistics) "clear stats" (clear server statistics).
-- Restructure job accounting query to use 'id_job in (1, 2, .. )' format
instead of logically equivalent 'id_job = 1 || id_job = 2 || ..' .
-- Added start_delay field to jobcomp/elasticsearch.
-- In order to support federated jobs, the MaxJobID configuration parameter
default value has been reduced from 2,147,418,112 to 67,043,328 and its
maximum value is now 67,108,863. Upon upgrading, any pre-existing jobs that
have a job ID above the new range will continue to run and new jobs will get
job IDs in the new range.
-- Added infrastructure for setting up federations in database and establishing
connections between federation clusters.
* Changes in Slurm 16.05.12
===========================
* Changes in Slurm 16.05.11
===========================
-- burst_buffer/cray - Add support for line continuation.
-- If a job is cancelled by the user while it's allocated nodes are being
reconfigured (i.e. the capmc_resume program is rebooting nodes for the job)
and the node reconfiguration fails (i.e. the reboot fails), then don't
requeue the job but leave it in a cancelled state.
-- capmc_resume (Cray resume node script) - Do not disable changing a node's
active features if SyscfgPath is configured in the knl.conf file.
-- Fix memory error when updating a job's licenses.
-- Fix double read lock of tres when updating gres or licenses on a job.
-- Fix regression in 16.05.10 with respects to GrpTresMins on a QOS or
Association.
-- ALPS - Fix scheduling when ALPS doesn't agree with Slurm on what nodes
are free.
-- Fix seg fault if loading attempting to load non-existent burstbuffer plugin.
-- Fix to backfill scheduling with respect to QOS and association limits. Jobs
submitted to multiple partitions are most likley to be effected.
-- Avoid erroneous errno set by the mariadb 10.2 api.
-- Fix security issue in Prolog and Epilog by always prepending SPANK_ to
all user-set environment variables. CVE-2017-15566.
* Changes in Slurm 16.05.10-2
=============================
-- Replace clock_gettime with time(NULL) for very old systems without the call.
* Changes in Slurm 16.05.10
===========================
-- Record job state as PREEMPTED instead of TIMEOUT when GraceTime is reached.
-- task/cgroup - print warnings to stderr when --cpu_bind=verbose is enabled
and the requested processor affinity cannot be set.
-- power/cray - Disable power cap get and set operations on DOWN nodes.
-- Jobs preempted with PreemptMode=REQUEUE were incorrectly recorded as
REQUEUED in the accounting.
-- PMIX - Use volatile specifier to avoid flag caching and lock the flag to
make sure it is protected.
-- PMIX/PMI2 - Make it possible to use %n or %h in a spool dir.
-- burst_buffer/cray - Support default pool which is not the first pool
reported by DataWarp and log in Slurm when pools that are added or removed
from DataWarp.
-- Insure job does not start running before PrologSlurmctld is complete and
node is booted (all nodes for interactive job, at least first node for batch
job without burst buffers).
-- Fix minor memory leak in the slurmctld when removing a QOS.
-- burst_buffer/cray - Do not execute "pre_run" operation until after all nodes
are booted and ready for use.
-- scontrol - return an error when attempting to use the +=/-+ syntax to
update a field where this is not appropriate.
-- Fix task/affinity to work correctly with --ntasks-per-socket.
-- Honor --ntasks-per-node and --ntasks option when used with job constraints
that contain node counts.
-- Prevent deadlocked slurmstepd processes due to unsafe use of regcomp with
older glibc versions.
-- Fix squeue when SLURM_BITSTR_LEN=0 is set in the user environment.
-- Fix comments in acct_policy.c to reflect actual variables instead of
old ones.
-- Fix correct variables when validating GrpTresMins on a QOS.
-- Better debug output when a job is being held because of a GrpTRES[Run]Min
limits.
-- Fix correct state reason when job can't run 'safely' because of an
association GrpWall limit.
-- Squeue always loads new data if user_id option specified
-- Fix for possible job ID parsing failure and abort.
-- If node boot in progress when slurmctld daemon is restarted, then allow
sufficient time for reboot to complete and not prematurely DOWN the node as
"Not responding".
-- For job resize, correct logic to build "resize" script with new values.
Previously the scripts were based upon the original job size.
-- Fix squeue to not limit the size of partition, burst_buffer, exec_host, or
reason to 32 chars.
-- Fix potential packing error when packing a NULL slurmdb_clus_res_rec_t.
-- Fix potential packing errors when packing a NULL slurmdb_reservation_cond_t.
-- Burst_buffer/cray - Prevent slurmctld daemon abort if "paths" operation
fails. Now job will be held. Update job update time when held.
-- Fix issues with QOS flags Partition[Min|Max]Nodes to work correctly.
-- Increase number of ResumePrograms that can be managed without leaving
zombie/orphan processes from 10 to 100.
-- Refactor slurmctld agent logic to eliminate some pthreads.
* Changes in Slurm 16.05.9
==========================
-- Fix parsing of SBCAST_COMPRESS environment variable in sbcast.
-- Change some debug messages to errors in task/cgroup plugin.
-- backfill scheduler: Stop trying to determine expected start time for a job
after 2 seconds of wall time. This can happen if there are many running jobs
and a pending job can not be started soon.
-- Improve performance of cr_sort_part_rows() in cons_res plugin.
-- CRAY - Fix dealock issue when updating accounting in the slurmctld and
scheduling a Datawarp job.
-- Correct the job state accounting information for jobs requeued due to burst
buffer errors.
-- burst_buffer/cray - Avoid "pre_run" operation if not using buffer (i.e.
just creating or deleting a persistent burst buffer).
-- Fix slurm.spec file support for BlueGene builds.
-- Fix missing TRES read lock in acct_policy_job_runnable_pre_select() code.
-- Fix debug2 message printing value using wrong array index in
_qos_job_runnable_post_select().
-- Prevent job timeout on node power up.
-- MYSQL - Fix minor memory leak when querying steps and the sql fails.
-- Make it so sacctmgr accepts column headers like MaxTRESPU and not MaxTRESP.
-- Only look at SLURM_STEP_KILLED_MSG_NODE_ID on startup, to avoid race
condition later when looking at a steps env.
-- Make backfill scheduler behave like regular scheduler in respect to
'assoc_limit_stop'.
-- Allow a lower version client command to talk to a higher version contoller
using the multi-cluster options (e.g. squeue -M<cluster>).
-- slurmctld/agent race condition fix: Prevent job launch while PrologSlurmctld
daemon is running or node boot in progress.
-- MYSQL - Fix a few other minor memory leaks when uncommon failures occur.
-- burst_buffer/cray - Fix race condition that could cause multiple batch job
launch requests resulting in drained nodes.
-- Correct logic to purge old reservations.
-- Fix DBD cache restore from previous versions.
-- Fix to logic for getting expected start time of existing job ID with
explicit begin time that is in the past.
-- Clear job's reason of "BeginTime" in a more timely fashion and/or prevents
them from being stuck in a PENDING state.
-- Make sure acct policy limits imposed on a job are correct after requeue.
* Changes in Slurm 16.05.8
==========================
-- Remove StoragePass from being printed out in the slurmdbd log at debug2
level.
-- Defer PATH search for task program until launch in slurmstepd.
-- Modify regression test1.89 to avoid leaving vestigial job. Also reduce
logging to reduce likelyhood of Expect buffer overflow.
-- Do not PATH search for mult-prog launches if LaunchParamters=test_exec is
enabled.
-- Fix for possible infinite loop in select/cons_res plugin when trying to
satisfy a job's ntasks_per_core or socket specification.
-- If job is held for bad constraints make it so once updated the job doesn't
go into JobAdminHeld.
-- sched/backfill - Fix logic to reserve resources for jobs that require a
node reboot (i.e. to change KNL mode) in order to start.
-- When unpacking a node or front_end record from state and the protocol
version is lower than the min version, set it to the min.
-- Remove redundant lookup for part_ptr when updating a reservation's nodes.
-- Fix memory and file descriptor leaks in slurmd daemon's sbcast logic.
-- Do not allocate specialized cores to jobs using the --exclusive option.
-- Cancel interactive job if Prolog failure with "PrologFlags=contain" or
"PrologFlags=alloc" configured. Send new error prolog failure message to
the salloc or srun command as needed.
-- Prevent possible out-of-bounds read in slurmstepd on an invalid #! line.
-- Fix check for PluginDir within slurmctld to work with multiple directories.
-- Cancel interactive jobs automatically on communication error to launching
srun/salloc process.
-- Fix security issue caused by insecure file path handling triggered by the
failure of a Prolog script. To exploit this a user needs to anticipate or
cause the Prolog to fail for their job. CVE-2016-10030.
* Changes in Slurm 16.05.7
==========================
-- Fix issue in the priority/multifactor plugin where on a slurmctld restart,
where more time is accounted for than should be allowed.
-- cray/busrt_buffer - If total_space in a pool decreases, reset used_space
rather than trying to account for buffer allocations in progress.
-- cray/busrt_buffer - Fix for double counting of used_space at slurmctld
startup.
-- Fix regression in 16.05.6 where if you request multiple cpus per task (-c2)
and request --ntasks-per-core=1 and only 1 task on the node
the slurmd would abort on an infinite loop fatal.
-- cray/busrt_buffer - Internally track both allocated and unusable space.
The reported UsedSpace in a pool is now the allocated space (previously was
unusable space). Base available space on whichever value leaves least free
space.
-- cray/burst_buffer - Preserve job ID and don't translate to job array ID.
-- cray/burst_buffer - Update "instance" parsing to match updated dw_wlm_cli
output.
-- sched/backfill - Insure we don't try to start a job that was already started
and requeued by the main scheduling logic.
-- job_submit/lua - add access to the job features field in job_record.
-- select/linear plugin modified to better support heterogeneous clusters when
topology/none is also configured.
-- Permit cancellation of jobs in configuring state.
-- acct_gather_energy/rapl - prevent segfault in slurmd from race to gather
data at slurmd startup.
-- Integrate node_feature/knl_generic with "hbm" GRES information.
-- Fix output routines to prevent rounding the TRES values for memory or BB.
-- switch/cray plugin - fix use after free error.
-- docs - elaborate on how way to clear TRES limits in sacctmgr.
-- knl_cray plugin - Avoid abort from backup slurmctld at start time.
-- cgroup plugins - fix two minor memory leaks.
-- If a node is booting for some job, don't allocate additional jobs to the
node until the boot completes.
-- testsuite - fix job id output in test17.39.
-- Modify backfill algorithm to improve performance with large numbers of
running jobs. Group running jobs that end in a "similar" time frame using a
time window that grows exponentially rather than linearly. After one second
of wall time, simulate the termination of all remaining running jobs in
order to respond in a reasonable time frame.
-- Fix slurm_job_cpus_allocated_str_on_node_id() API call.
-- sched/backfill plugin: Make malloc match data type (defined as uint32_t and
allocated as int).
-- srun - prevent segfault when terminating job step before step has launched.
-- sacctmgr - prevent segfault when trying to reset usage for an invalid
account name.
-- Make the openssl crypto plugin compile with openssl >= 1.1.
-- Fix SuspendExcNodes and SuspendExcParts on slurmctld reconfiguration.
-- sbcast - prevent segfault in slurmd due to race condition between file
transfers from separate jobs using zlib compression
-- cray/burst_buffer - Increase time to synchronize operations between threads
from 5 to 60 seconds ("setup" operation time observed over 17 seconds).
-- node_features/knl_cray - Fix possible race condition when changing node
state that could result in old KNL mode as an active features.
-- Make sure if a job can't run because of resources we also check accounting
limits after the node selection to make sure it doesn't violate those limits
and if it does change the reason for waiting so we don't reserve resources
on jobs violating accounting limits.
-- NRT - Make it so a system running against IBM's PE will work with PE
version 1.3.
-- NRT - Make it so protocols pgas and test are allowed to be used.
-- NRT - Make it so you can have more than 1 protocol listed in MP_MSG_API.
-- cray/burst_buffer - If slurmctld daemon restarts with pending job and burst
buffer having unknown file stage-in status, teardown the buffer, defer the
job, and start stage-in over again.
-- On state restore in the slurmctld don't overwrite the mem_spec_limit given
from the slurm.conf when using FastSchedule=0.
-- Recognize a KNL's proper NUMA count (rather than setting it to the value
in slurm.conf) when using FastSchedule=0.
-- Fix parsing in regression test1.92 for some prompts.
-- sbcast - use slurmd's gid cache rather than a separate lookup.
-- slurmd - return error if setgroups() call fails in _drop_privileges().
-- Remove error messages about gres counts changing when a job is resized on
a slurmctld restart or reconfig, as they aren't really error messages.
-- Fix possible memory corruption if a job is using GRES and changing size.
-- jobcomp/elasticsearch - fix printf format for a value on 32-bit builds.
-- task/cgroup - Change error message if CPU binding can not take place to
better identify the root cause of the problem.
-- Fix issue where task/cgroup would not always honor --cpu_bind=threads.
-- Fix race condition in with getgrouplist() in slurmd that can lead to
user accounts being granted access to incorrect group memberships during
job launch.
* Changes in Slurm 16.05.6
==========================
-- Docs - the correct default value for GroupUpdateForce is 0.
-- mpi/pmix - improve point to point communication performance.
-- SlurmDB - include pending jobs in search during 'sacctmgr show runawayjobs'.
-- Add client side out-of-range checks to --nice flag.
-- Fix support for sbatch "-W" option, previously eeded to use "--wait".
-- node_features/knl_cray plugin and capmc_suspend/resume programs modified to
sleep and retry capmc operations if the Cray State Manager is down. Added
CapmcRetries configuration parameter to knl_cray.conf.
-- node_features/knl_cray plugin: Remove any KNL MCDRAM or NUMA features from
node's configuration if capmc does NOT report the node as being KNL.
-- node_features/knl_cray plugin: drain any node not reported by
"capmc node_status" on startup or reconfig.
-- node_features/knl_cray plugin: Substantially streamline and speed up logic
to load current node state on reconfigure failure or unexpected node boot.
-- node_features/knl_cray plugin: Add separate thread to interact with capmc
in response to unexpected node reboots.
-- node_features plugin - Add "mode" argument to node_features_p_node_xlate()
function to fix some bugs updating a node's features using the node update
RPC.
-- node_features/knl_cray plugin: If the reconfiguration of nodes for an
interactive job fails, kill the job (it can't be requeued like a batch job).
-- Testsuite - Added srun/salloc/sbatch tests with --use-min-nodes option.
-- Fix typo when an error occurs when discovering pmix version on
configure.
-- Fix configuring pmix support when you have your lib dir symlinked to lib64.
-- Fix waiting reason if a job is waiting for a specific limit instead of
always just AccountingPolicy.
-- Correct SchedulerParameters=bf_busy_nodes logic with respect to the job's
minimum node count. Previous logic would not decremement counter in some
locations and reject valid job request for not reaching minimum node count.
-- Fix FreeBSD-11 build by using llabs() function in place of abs().
-- Cray: The slurmd can manipulate the socket/core/thread values reported based
upon the configuration. The logic failed to consider select/cray with
SelectTypeParameters=other_cons_res as equivalent to select/cons_res.
-- If a node's socket or core count are changed at registration time (e.g. a
KNL node's NUMA mode is changed), change it's board count to match.
-- Prevent possible divide by zero in select/cons_res if a node's board count
is higher than it's socket count.
-- Allow an advanced reservation to contain a license count of zero.
-- Preserve non-KNL node features when updating the KNL node features for a
multi-node job in which the non-KNL node features vary by node.
-- task/affinity plugin: Honor a job's --ntasks-per-socket and
--ntasks-per-core options in task binding.
-- slurmd - do not print ClusterName when using 'slurmd -C'.
-- Correct a bitmap test function (used only by the select/bluegene plugin).
-- Do not propagate SLURM_UMASK environment variable to batch script.
-- Added node_features/knl_generic plugin for KNL support on non-Cray systems.
-- Cray: Prevent abort in backfill scheduling logic for requeued job that has
been cancelled while NHC is running.
-- Improve reported estimates of start and end times for pending jobs.
-- pbsnodes: Show OS value as "unknown" for down nodes.
-- BlueGene - correctly scale node counts when enforcing MaxNodes limit take 2.
-- Fix "sbatch --hold" to set JobHeldUser correctly instead of JobHeldAdmin.
-- Cray - print warning that task/cgroup is required, and must be after
task/cray in the TaskPlugin settings.
-- Document that node Weight takes precedence over load with LLN scheduling.
-- Fix issue where gang scheduling could happen even with OverSubscribe=NO.
-- Expose JOB_SHARED_* values to job_submit/lua plugin.
-- Fix issue where number of nodes is not properly allocated when srun is
requested with -n tasks < hosts from -w hostlist.
-- Update srun documentation for -N, -w and -m arbitrary.
-- Fix bug that was clearing MAINT mode on nodes scheduled for reboot (bug
introduced in version 16.05.5 to address bug in overlapping reservations).
-- Add logging of node reboot requests.
-- Docs - remove recommendation for ReleaseAgent setting in cgroup.conf.
-- Make sure a job cleans up completely if it has a node fail. Mostly an
issue with gang scheduling.
* Changes in Slurm 16.05.5
==========================
-- Fix accounting for jobs requeued after the previous job was finished.
-- slurmstepd modified to pre-load all relevant plugins at startup to avoid
the possibility of modified plugins later resulting in inconsistent API
or data structures and a failure of slurmstepd.
-- Export functions from parse_time.c in libslurm.so.
-- Export unit convert functions from slurm_protocol_api.c in libslurm.so.
-- Fix scancel to allow multiple steps from a job to be cancelled at once.
-- Update and expand upgrade guide (in Quick Start Administrator web page).
-- burst_buffer/cray: Requeue, but do not hold a job which fails the pre_run
operation.
-- Insure reported expected job start time is not in the past for pending jobs.
-- Add support for PMIx v2.
-- mpi/pmix: support for passing TMPDIR path through info key
-- Cray: update slurmconfgen_smw.py script to correctly identify service nodes
versus compute nodes.
-- FreeBSD - fix build issue in knl_cray plugin.
-- Corrections to gres.conf parsing logic.
-- Make partition State independent of EnforcePartLimits value.
-- Fix multipart srun submission with EnforcePartLimits=NO and job violating
the partition limits.
-- Fix problem updating job state_reason.
-- pmix - Provide HWLOC topology in the job-data if Slurm was configured
with hwloc.
-- Cray - Fix issue restoring jobs when blade count increases due to hardware
reconfiguration.
-- burst_buffer/cray - Hold job after 3 failed pre-run operations.
-- sched/backfill - Check that a user's QOS is allowed to use a partition
before trying to schedule resources on that partition for the job.
-- sacctmgr - Fix displaying nodenames when printing out events or
reservations.
-- Fix mpiexec wrapper to accept task count with more than one digit.
-- Add mpiexec man page to the script.
-- Add salloc_wait_nodes option to the SchedulerParameters parameter in the
slurm.conf file controlling when the salloc command returns in relation to
when nodes are ready for use (i.e. booted).
-- Handle case when slurmctld daemon restart while compute node reboot in
progress. Return node to service rather than setting DOWN.
-- Preserve node "RESERVATION" state when one of multiple overlapping
reservations ends.
-- Restructure srun command locking for task_exit processing logic for improved
parallelism.
-- Modify srun task completion handling to only build the task/node string for
logging purposes if it is needed. Modified for performance purposes.
-- Docs - update salloc/sbatch/srun man pages to mention corresponding
environment variables for --mem/--mem-per-cpu and allowed suffixes.
-- Silence srun warning when overriding the job ntasks-per-node count
with a lower task count for the step.
-- Docs - assorted spelling fixes.
-- node_features/knl_cray: Fix bug where MCDRAM state could be taken from
capmc rather than cnselect.
-- node_features/knl_cray: If a node is rebooted outside of Slurm's direction,
update it's active features with current MCDRAM and NUMA mode information.
-- Restore ability to manually power down nodes, broken in 15.08.12.
-- Don't log error for job end_time being zero if node health check is still
running.
-- When powering up a node to change it's state (e.g. KNL NUMA or MCDRAM mode)
then pass to the ResumeProgram the job ID assigned to the nodes in the
SLURM_JOB_ID environment variable.
-- Allow a node's PowerUp state flag to be cleared using update_node RPC.
-- capmc_suspend/resume - If a request modify NUMA or MCDRAM state on a set of
nodes or reboot a set of nodes fails then just requeue the job and abort the
entire operation rather than trying to operate on individual nodes.
-- node_features/knl_cray plugin: Increase default CapmcTimeout parameter from
10 to 60 seconds.
-- Fix squeue filter by job license when a job has requested more than 1
license of a certain type.
-- Fix bug in PMIX_Ring in the pmi2 plugin so that it supports singleton mode.
It also updates the testpmixring.c test program so it can be used to check
singleton runs.
-- Automically clean up task/cgroup cpuset and devices cgroups after steps are
completed.
-- Testsuite - Fix test1.83 to handle gaps in node names properly.
-- BlueGene - correctly scale node counts when enforcing MaxNodes limit.
-- Make sure no attempt is made to schedule a requeued job until all steps are
cleaned (Node Health Check completes for all steps on a Cray).
-- KNL: Correct task affinity logic for some NUMA modes.
-- Add salloc/sbatch/srun --priority option of "TOP" to set job priority to
the highest possible value. This option is only available to Slurm operators
and administrators.
-- Add salloc/sbatch/srun option --use-min-nodes to prefer smaller node counts
when a range of node counts is specified (e.g. "-N 2-4").
-- Validate salloc/sbatch --wait-all-nodes argument.
-- Add "sbatch_wait_nodes" to SchedulerParameters to control default sbatch
behaviour with respect to waiting for all allocated nodes to be ready for
use. Job can override the configuration option using the --wait-all-nodes=#
option.
-- Prevent partition group access updates from resetting last_part_update when
no changes have been made. Prevents backfill scheduler from restarting
mid-cycle unnecessarily.
-- Cray - add NHC_ABSOLUTELY_NO to never run NHC, even on certain edge cases
that it would otherwise be run on with NHC_NO.
-- Ignore GRES/QOS updates that maintain the same value as before.
-- mpi/pmix - prepare temp directory for application.
-- Fix display for the nice and priority values in sprio/scontrol/squeue.
* Changes in Slurm 16.05.4
==========================
-- Fix potential deadlock if running with message aggregation.
-- Streamline when schedule() is called when running with message aggregation
on batch script completes.
-- Fix incorrect casting when [un]packing derived_ec on slurmdb_job_rec_t.
-- Document that persistent burst buffers can not be created or destroyed using
the salloc or srun --bb options.
-- Add support for setting the SLURM_JOB_ACCOUNT, SLURM_JOB_QOS and
SLURM_JOB_RESERVAION environment variables are set for the salloc command.
Document the same environment variables for the salloc, sbatch and srun
commands in their man pages.
-- Fix issue where sacctmgr load cluster.cfg wouldn't load associations
that had a partition in them.
-- Don't return the extern step from sstat by default.
-- In sstat print 'extern' instead of 4294967295 for the extern step.
-- Make advanced reservations work properly with core specialization.
-- Fix race condition in the account_gather plugin that could result in job
stuck in COMPLETING state.
-- Regression test fixes if SelectTypePlugin not managing memory and no node
memory size set (defaults to 1 MB per node).
-- Add missing partition write locks to _slurm_rpc_dump_nodes/node_single to
prevent a race condition leading to inconsistent sinfo results.
-- Fix task:CPU binding logic for some processors. This bug was introduced
in version 16.05.1 to address KNL bunding problem.
-- Fix two minor memory leaks in slurmctld.
-- Improve partition-specific limit logging from slurmctld daemon.
-- Fix incorrect access check when using MaxNodes setting on the partition.
-- Fix issue with sacctmgr when specifying a list of clusters to query.
-- Fix issue when calculating future StartTime for a job.
-- Make EnforcePartLimit support logic work with any ordering of partitions
in job submit request.
-- Prevent restoration of wrong CPU governor and frequency when using
multiple task plugins.
-- Prevent slurmd abort if hwloc library fails to populate the "children"
arrays (observed with hwloc version "dev-333-g85ea6e4").
-- burst_buffer/cray: Add "--groupid" to DataWarp "setup" command.
-- Fix lustre profiling putting it in the Filesystem dataset instead of the
Network dataset.
-- Fix profiling documentation and code to match be consistent with
Filesystem instead of Lustre.
-- Correct the way watts is calculated in the rapl plugin when using a poll
frequency other than AcctGatherNodeFreq.
-- Don't about step launch if job reaches expected end time while node is
configuring/booting (NOTE: The job end time will be adjusted after node
becomes ready for use).
-- Fix several print routines to respect a custom output delimiter when
printing NO_VAL or INFINITE.
-- Correct documented configurations where --ntasks-per-core and
--ntasks-per-socket are supported.
-- task/affinity plugin buffer allocated too small, can corrupt memory.
* Changes in Slurm 16.05.3
==========================
-- Make it so the extern step uses a reverse tree when cleaning up.
-- If extern step doesn't get added into the proctrack plugin make sure the
sleep is killed.
-- Fix areas the slurmctld can segfault if an extern step is in the system
cleaning up on a restart.
-- Prevent possible incorrect counting of GRES of a given type if a node has
the multiple "types" of a given GRES "name", which could over-subscribe
GRES of a given type.
-- Add web links to Slurm Diamond Collectors (from Harvard University) and
collectd (from EDF).
-- Add job_submit plugin for the "reboot" field.
-- Make some more Slurm constants (INFINITE, NO_VAL64, etc.) available to
job_submit/lua plugins.
-- Send in a -1 for a taskid into spank_task_post_fork for the extern_step.
-- MYSQL - Sightly better logic if a job completion comes in with an end time
of 0.
-- task/cgroup plugin is configured with ConstrainRAMSpace=yes, then set soft
memory limit to allocated memory limit (previously no soft limit was set).
-- Document limitations in burst buffer use by the salloc command (possible
access problems from a login node).
-- Fix proctrack plugin to only add the pid of a process once
(regression in 16.05.2).
-- Fix for sstat to print correct info when requesting jobid.batch as part of
a comma-separated list.
-- CRAY - Fix issue if pid has already been added to another job container.
-- CRAY - Fix add of extern step to AELD.
-- burstbufer/cray: avoid batch submit error condition if waiting for stagein.
-- CRAY - Fix for reporting steps lingering after they are already finished.
-- Testsuite - fix test1.29 / 17.15 for limits with values above 32-bits.
-- CRAY - Simplify when a NHC is called on a step that has unkillable
processes.
-- CRAY - If trying to kill a step and you have NHC_NO_STEPS set run NHC
anyway to attempt to log the backtraces of the potential
unkillable processes.
-- Fix gang scheduling and license release logic if single node job killed on
bad node.
-- Make scontrol show steps show the extern step correctly.
-- Do not scheduled powered down nodes in FAILED state.
-- Do not start slurmctld power_save thread until partition information is read
in order to prevent race condition that can result invalid pointer when
trying to resolve configured SuspendExcParts.
-- Add SLURM_PENDING_STEP id so it won't be confused with SLURM_EXTERN_CONT.
-- Fix for core selection with job --gres-flags=enforce-binding option.
Previous logic would in some cases allocate a job zero cores, resulting in
slurmctld abort.
-- Minimize preempted jobs for configurations with multiple jobs per node.
-- Improve partition AllowGroups caching. Update the table of UIDs permitted to
use a partition based upon it's AllowGroups configuration parameter as new
valid UIDs are found rather than looking up that user's group information
for every job they submit. If the user is now allowed to use the partition,
then do not check that user's group access again for 5 seconds.
-- Add routing queue information to Slurm FAQ web page.
-- Do not select_g_step_finish() a SLURM_PENDING_STEP step, as nothing has
been allocated for the step yet.
-- Fixed race condition in PMIx Fence logic.
-- Prevent slurmctld abort if job is killed or requeued while waiting for
reboot of its allocated compute nodes.
-- Treat invalid user ID in AllowUserBoot option of knl.conf file as error
rather than fatal (log and do not exit).
-- qsub - When doing the default output files for an array in qsub style
make them using the master job ID instead of the normal job ID.
-- Create the extern step while creating the job instead of waiting until the
end of the job to do it.
-- Always report a 0 exit code for the extern step instead of being canceled
or failed based on the signal that would always be killing it.
-- Fix to allow users to update QOS of pending jobs.
-- CRAY - Fix minor memory leak in switch plugin.
-- CRAY - Change slurmconfgen_smw.py to skip over disabled nodes.
-- Fix eligible_time for elasticsearch as well as add queue_wait
(difference between start of job and when it was eligible).
* Changes in Slurm 16.05.2
==========================
-- CRAY - Fix issue where the proctrack plugin could hang if the container
id wasn't able to be made.
-- Move test for job wait reason value of BurstBufferResources and
BurstBufferStageIn later in the scheduling logic.
-- Document which srun options apply to only job, only step, or job and step
allocations.
-- Use more compatible function to get thread name (>= 2.6.11).
-- Fix order of job then step id when noting cleaning flag being set.
-- Make it so the extern step sends a message with accounting information
back to the slurmctld.
-- Make it so the extern step calls the select_g_step_start|finish functions.
-- Don't print error when extern step is canceled because job is ending.
-- Handle a few error codes when dealing with the extern step to make sure
we have the pids added to the system correctly.
-- Add support for job dependencies with job array expressions. Previous logic
required listing each task of job array individually.
-- Make sure tres_cnt is set before creating a slurmdb_assoc_usage_t.
-- Prevent backfill scheduler from starting a second "singleton" job if another
one started during a backfill sleep.
-- Fix for invalid array pointer when creating advanced reservation when job
allocations span heterogeneous nodes (differing core or socket counts).
-- Fix hostlist_ranged_string_xmalloc_dims to correctly not put brackets on
hostlists when brackets == 0.
-- Make sure we don't get brackets when making a range of reserved ports
for a step.
-- Change fatal to an error if port ranges aren't correct when reading state
for steps.
* Changes in Slurm 16.05.1
==========================
-- Fix __cplusplus macro in spank.h to allow compilation with C++.
-- Fix compile issue with older glibc < 2.12
-- Fix for starting batch step with mpi/pmix plugin.
-- Fix for "scontrol -dd show job" with respect to displaying the specific
CPUs allocated to a job on each node. Prior logic would only display
the CPU information for the first node in the job allocation.
-- Print correct return code on failure to update active node features
through sview.
-- Allow QOS timelimit to override partition timelimit when EnforcePartLimits
is set to all/any.
-- Make it so qsub will do a "basename" on a wrapped command for the output
and error files.
-- Fix issue where slurmd could core when running the ipmi energy plugin.
-- Documentation - clean up typos.
-- Add logic so that slurmstepd can be launched under valgrind.
-- Increase buffer size to read /proc/*/stat files.
-- Fix for tracking job resource allocation when slurmctld is reconfigured
while Cray Node Health Check (NHC) is running. Previous logic would fail to
record the job's allocation then perform release operation upon NHC
completion, resulting in underflow error messages.
-- Make "scontrol show daemons" work with long node names.
-- CRAY - Collect energy using a uint64_t instead of uint32_t.
-- Fix incorrect if statements when determining if the user has a default
account or wckey.
-- Prevent job stuck in configuring state if slurmctld daemon restarted while
PrologSlurmctld is running. Also re-issue burst_buffer/pre-load operation
as needed.
-- Correct task affinity support for FreeBSD.
-- Fix for task affinity on KNL in SNC2/Flat mode.
-- Recalculate a job's memory allocation after node reboot if job requests all
of a node's memory and FastSchedule=0 is configured. Intel KNL memory size
can change on reboot with various MCDRAM modes.
-- Fix small memory leak when printing HealthCheckNodeState.
-- Eliminate memory leaks when AuthInfo is configured.
-- Improve sdiag output description in man page.
-- Cray/capmc_resume script modify a node's features (as needed) when the
reinit (reboot) command is issued rather than wait for the nodes to change
to the "on" state.
-- Correctly print ranges when using step values in job arrays.
-- Allow from file names / paths over 256 characters when launching steps,
as well as spaces in the executable name.
-- job_submit.license.lua example modified to send message back to user.
-- Document job --mem=0 option means all memory on a node.
-- Set SLURM_JOB_QOS environment variable to QOS name instead of description.
-- knl_cray.conf file option of CnselectPath added.
-- node_features/knl_cray plugin modified to get current node NUMA and MCDRAM
modes using cnselect command rather than capmc command.
-- liblua - add SLES12 paths to runtime search list.
-- Fix qsub default output and error files for task arrays.
-- Fix qsub to set job_name correctly when wrapping a script (-b y)
-- Cray - set EnforcePartLimits=any in slurm.conf template.
* Changes in Slurm 16.05.0
==========================
-- Update seff to fix warnings with ncpus, and list slurm-perlapi dependency
in spec file.
-- Fix testsuite to consistent use /usr/bin/env {bash,expect} construct.
-- Cray - Ensure that step completion messages get to the database.
-- Fix step cpus_per_task calculation for heterogeneous job allocation.
-- Fix --with-json= configure option to use specified path.
-- Add back thread_id to "thread_id" LogTimeFormat to distinguish between
mutliple threads with the same name. Now displays thread name and id.
-- Change how Slurm determines the NUMA count of a node. Ignore KNL NUMA
that only include memory.
-- Cray - Fix node list parsing in capmc_suspend/resume programs.
-- Fix sbatch #BSUB parsing for -W and -M options.
-- Fix GRES task layout bug that could cause slurmctld to abort.
-- Fix to --gres-flags=enforce-binding logic when multiple sockets needed.