Open MPI Release Information

The following is reproduced essentially verbatim from files contained within the Open MPI tarball downloaded from https://www.open-mpi.org.

Copyright (c) 2004-2010 The Trustees of Indiana University and Indiana
                        University Research and Technology
                        Corporation.  All rights reserved.
Copyright (c) 2004-2006 The University of Tennessee and The University
                        of Tennessee Research Foundation.  All rights
                        reserved.
Copyright (c) 2004-2008 High Performance Computing Center Stuttgart,
                        University of Stuttgart.  All rights reserved.
Copyright (c) 2004-2006 The Regents of the University of California.
                        All rights reserved.
Copyright (c) 2006-2017 Cisco Systems, Inc.  All rights reserved.
Copyright (c) 2006      Voltaire, Inc. All rights reserved.
Copyright (c) 2006      Sun Microsystems, Inc.  All rights reserved.
                        Use is subject to license terms.
Copyright (c) 2006-2017 Los Alamos National Security, LLC.  All rights
                        reserved.
Copyright (c) 2010-2017 IBM Corporation.  All rights reserved.
Copyright (c) 2012      Oak Ridge National Labs.  All rights reserved.
Copyright (c) 2012-2017 Sandia National Laboratories.  All rights reserved.
Copyright (c) 2012      University of Houston. All rights reserved.
Copyright (c) 2013      NVIDIA Corporation.  All rights reserved.
Copyright (c) 2013-2017 Intel, Inc. All rights reserved.
Copyright (c) 2017      Research Organization for Information Science
                        and Technology (RIST). All rights reserved.
Copyright (c) 2018      Amazon.com, Inc. or its affiliates.  All Rights
                        reserved.

Additional copyrights may follow.

As more fully described in the "Software Version Number" section in
the README file, Open MPI typically releases two separate version
series simultaneously.  Since these series have different goals and
are semi-independent of each other, a single NEWS-worthy item may be
introduced into different series at different times.  For example,
feature F was introduced in the vA.B series at version vA.B.C, and was
later introduced into the vX.Y series at vX.Y.Z.

The first time feature F is released, the item will be listed in the
vA.B.C section, denoted as:

   (** also to appear: X.Y.Z) -- indicating that this item is also
                                 likely to be included in future release
                                 version vX.Y.Z.

When vX.Y.Z is later released, the same NEWS-worthy item will also be
included in the vX.Y.Z section and be denoted as:

   (** also appeared: A.B.C)  -- indicating that this item was previously
                                 included in release version vA.B.C.
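
The two markers above follow a regular pattern, so they can be picked
out of this file mechanically.  The following is a minimal sketch, not
part of Open MPI: the function name parse_markers and the labels
"future" and "previous" are made up for illustration.

```python
import re

# Matches the two cross-reference markers described above, e.g.
# "(** also to appear: X.Y.Z)" or "(** also appeared: A.B.C)".
MARKER = re.compile(r"\(\*\*\s+also\s+(to appear|appeared):\s*([^)]+?)\s*\)")


def parse_markers(text):
    """Return (kind, version) pairs for every marker found in text.

    kind is "future" for "also to appear" and "previous" for
    "also appeared" (both labels are illustrative).
    """
    kinds = {"to appear": "future", "appeared": "previous"}
    return [(kinds[m.group(1)], m.group(2)) for m in MARKER.finditer(text)]


sample = (
    "- Feature F.  (** also to appear: X.Y.Z)\n"
    "- Feature G.  (** also appeared: A.B.C)\n"
)
print(parse_markers(sample))
```

Run on the two-line sample, this reports one "future" reference to
X.Y.Z and one "previous" reference to A.B.C.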

4.0.1 -- March, 2019
--------------------

- Update embedded PMIx to 3.1.2.
- Fix an issue with Vader (shared-memory) transport on OS-X. Thanks
  to Daniel Vollmer for reporting.
- Fix a problem with the usNIC BTL Makefile.  Thanks to George Marselis
  for reporting.
- Fix an issue when using --enable-visibility configure option
  and older versions of hwloc.  Thanks to Ben Menadue for reporting
  and providing a fix.
- Fix an issue with MPI_WIN_CREATE_DYNAMIC and MPI_GET from self.
  Thanks to Bart Janssens for reporting.
- Fix an issue of excessive compiler warning messages from mpi.h
  when using newer C++ compilers.  Thanks to @Shadow-fax for
  reporting.
- Fix a problem when building Open MPI using clang 5.0.
- Fix a problem with MPI_WIN_CREATE when using UCX.  Thanks
  to Adam Simpson for reporting.
- Fix a memory leak encountered for certain MPI datatype
  destructor operations.  Thanks to Axel Huebl for reporting.
- Fix several problems with MPI RMA accumulate operations.
  Thanks to Jeff Hammond for reporting.
- Fix possible race condition in closing some file descriptors
  during job launch using mpirun.  Thanks to Jason Williams
  for reporting and providing a fix.
- Fix a problem in OMPIO for large individual write operations.
  Thanks to Axel Huebl for reporting.
- Fix a problem with parsing of map-by ppr options to mpirun.
  Thanks to David Rich for reporting.
- Fix a problem observed when using the mpool hugepage component.  Thanks
  to Hunter Easterday for reporting and fixing.
- Fix valgrind warning generated when invoking certain MPI Fortran
  data type creation functions.  Thanks to @rtoijala for reporting.
- Fix a problem when trying to build with a PMIX 3.1 or newer
  release.  Thanks to Alastair McKinstry for reporting.
- Fix a problem encountered with building MPI F08 module files.
  Thanks to Igor Andriyash and Axel Huebl for reporting.
- Fix two memory leaks encountered for certain MPI-RMA usage patterns.
  Thanks to Joseph Schuchart for reporting and fixing.
- Fix a problem with the ORTE rmaps_base_oversubscribe MCA parameter.
  Thanks to @iassiour for reporting.
- Fix a problem with UCX PML default error handler for MPI communicators.
  Thanks to Marcin Krotkiewski for reporting.
- Fix various issues with OMPIO uncovered by the testmpio test suite.

4.0.0 -- September, 2018
------------------------

- OSHMEM updated to the OpenSHMEM 1.4 API.
- Do not build OpenSHMEM layer when there are no SPMLs available.
  Currently, this means the OpenSHMEM layer will only build if
  an MXM or UCX library is found.
- A UCX BTL was added for enhanced MPI RMA support using UCX
- With this release, the OpenIB BTL now only supports iWarp and RoCE by default.
- Updated internal HWLOC to 2.0.2
- Updated internal PMIx to 3.0.2
- Change the priority for selecting external versus internal HWLOC
  and PMIx packages to build.  Starting with this release, configure
  by default selects available external HWLOC and PMIx packages over
  the internal ones.
- Updated internal ROMIO to 3.2.1.
- Removed support for the MXM MTL.
- Removed support for SCIF.
- Improved CUDA support when using UCX.
- Enable use of CUDA allocated buffers for OMPIO.
- Improved support for two phase MPI I/O operations when using OMPIO.
- Added support for Software-based Performance Counters, see
  https://github.com/davideberius/ompi/wiki/
          How-to-Use-Software-Based-Performance-Counters-(SPCs)-in-Open-MPI
- Change MTL OFI from opting-IN on "psm,psm2,gni" to opting-OUT on
  "shm,sockets,tcp,udp,rstream"
- Various improvements to MPI RMA performance when using RDMA
  capable interconnects.
- Update memkind component to use the memkind 1.6 public API.
- Fix a problem with javadoc builds using OpenJDK 11.  Thanks to
  Siegmar Gross for reporting.
- Fix a memory leak using UCX.  Thanks to Charles Taylor for reporting.
- Fix hangs in MPI_FINALIZE when using UCX.
- Fix a problem with building Open MPI using an external PMIx 2.1.2
  library.  Thanks to Marcin Krotkiewski for reporting.
- Fix race conditions in Vader (shared memory) transport.
- Fix problems with use of newer map-by mpirun options.  Thanks to
  Tony Reina for reporting.
- Fix rank-by algorithms to properly rank by object and span
- Allow for running as root if two environment variables are set.
  Requested by Axel Huebl.
- Fix a problem with building the Java bindings when using Java 10.
  Thanks to Bryce Glover for reporting.
- Fix a problem with ORTE not reporting error messages if an application
  terminated normally but exited with non-zero error code.  Thanks to
  Emre Brookes for reporting.

3.1.3 -- October, 2018
----------------------

- Fix race condition in MPI_THREAD_MULTIPLE support of non-blocking
  send/receive path.
- Fix error handling SIGCHLD forwarding.
- Add support for CHARACTER and LOGICAL Fortran datatypes for MPI_SIZEOF.
- Fix compile error when using OpenJDK 11 to compile the Java bindings.
- Fix crash when using a hostfile with a 'user@host' line.
- Numerous Fortran '08 interface fixes.
- TCP BTL error message fixes.
- OFI MTL now will use any provider other than shm, sockets, tcp, udp, or
  rstream, rather than only supporting gni, psm, and psm2.
- Disable async receive of CUDA buffers by default, fixing a hang
  on large transfers.
- Support the BCM57XXX and BCM58XXX Broadcom adapters.
- Fix minmax datatype support in ROMIO.
- Bug fixes in vader shared memory transport.
- Support very large buffers with MPI_TYPE_VECTOR.
- Fix hang when launching with mpirun on Cray systems.
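
The OFI MTL item above replaces an opt-in provider list with an
opt-out list.  The difference can be sketched as follows.  This is
only an illustration of the two selection rules using the provider
names from that item; it is not the component's actual matching code,
and the function names are made up.

```python
# Provider names taken from the OFI MTL item above.
OLD_INCLUDE = {"gni", "psm", "psm2"}
NEW_EXCLUDE = {"shm", "sockets", "tcp", "udp", "rstream"}


def selected_old(provider):
    """Opt-in rule: only listed providers are used."""
    return provider in OLD_INCLUDE


def selected_new(provider):
    """Opt-out rule: every provider is used unless it is listed."""
    return provider not in NEW_EXCLUDE


# "verbs" is just an example name that appears in neither list.
for name in ("psm2", "verbs", "tcp"):
    print(name, selected_old(name), selected_new(name))
```

A provider named in neither list is rejected by the old rule and
accepted by the new one, which is the practical effect of the change.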

3.1.2 -- August, 2018
---------------------

- A subtle race condition bug was discovered in the "vader" BTL
  (shared memory communications) that, in rare instances, can cause
  MPI processes to crash or incorrectly classify (or effectively drop)
  an MPI message sent via shared memory.  If you are using the "ob1"
  PML with "vader" for shared memory communication (note that vader is
  the default for shared memory communication with ob1), you need to
  upgrade to v3.1.2 or later to fix this issue.  You may also upgrade
  to the following versions to fix this issue:
  - Open MPI v2.1.5 (expected end of August, 2018) or later in the
    v2.1.x series
  - Open MPI v3.0.1 (released March, 2018) or later in the v3.0.x
    series
- Assorted Portals 4.0 bug fixes.
- Fix for possible data corruption in MPI_BSEND.
- Move shared memory file for vader btl into /dev/shm on Linux.
- Fix for MPI_ISCATTER/MPI_ISCATTERV Fortran interfaces with MPI_IN_PLACE.
- Upgrade PMIx to v2.1.3.
- Numerous One-sided bug fixes.
- Fix for race condition in uGNI BTL.
- Improve handling of large number of interfaces with TCP BTL.
- Numerous UCX bug fixes.

3.1.1 -- June, 2018
-------------------

- Fix potential hang in UCX PML during MPI_FINALIZE
- Update internal PMIx to v2.1.2rc2 to fix forward version compatibility.
- Add new MCA parameter osc_sm_backing_store to allow users to specify
  where in the filesystem the backing file for the shared memory
  one-sided component should live.  Defaults to /dev/shm on Linux.
- Fix potential hang on non-x86 platforms when using builds with
  optimization flags turned off.
- Disable osc/pt2pt when using MPI_THREAD_MULTIPLE due to numerous
  race conditions in the component.
- Fix dummy variable names for the mpi and mpi_f08 Fortran bindings to
  match the MPI standard.  This may break applications which use
  name-based parameters in Fortran which used our internal names
  rather than those documented in the MPI standard.
- Revamp Java detection to properly handle new Java versions which do
  not provide a javah wrapper.
- Fix RMA function signatures for use-mpi-f08 bindings to have the
  asynchronous property on all buffers.
- Improved configure logic for finding the UCX library.

3.1.0 -- May, 2018
------------------

- Various OpenSHMEM bug fixes.
- Properly handle array_of_commands argument to Fortran version of
  MPI_COMM_SPAWN_MULTIPLE.
- Fix bug with MODE_SEQUENTIAL and the sharedfp MPI-IO component.
- Use "javac -h" instead of "javah" when building the Java bindings
  with a recent version of Java.
- Fix mis-handling of jostepid under SLURM that could cause problems
  with PathScale/OmniPath NICs.
- Disable the POWER 7/BE block in configure.  Note that POWER 7/BE is
  still not a supported platform, but it is no longer automatically
  disabled.  See
  https://github.com/open-mpi/ompi/issues/4349#issuecomment-374970982
  for more information.
- The output-filename option for mpirun is now converted to an
  absolute path before being passed to other nodes.
- Add monitoring component for PML, OSC, and COLL to track data
  movement of MPI applications.  See
  ompi/mca/common/monitoring/HowTo_pml_monitoring.tex for more
  information about the monitoring framework.
- Add support for communicator assertions: mpi_assert_no_any_tag,
  mpi_assert_no_any_source, mpi_assert_exact_length, and
  mpi_assert_allow_overtaking.
- Update PMIx to version 2.1.1.
- Update hwloc to 1.11.7.
- Many one-sided behavior fixes.
- Improved performance for Reduce and Allreduce using Rabenseifner's algorithm.
- Revamped mpirun --help output to make it a bit more manageable.
- Portals4 MTL improvements: Fix race condition in rendezvous protocol and
  retry logic.
- UCX OSC: initial implementation.
- UCX PML improvements: add multi-threading support.
- Yalla PML improvements: Fix error with irregular contiguous datatypes.
- Openib BTL: disable XRC support by default.
- TCP BTL: Add check to detect and ignore connections from processes
  that aren't MPI (such as IDS probes) and verify that source and
  destination are using the same version of Open MPI, fix issue with very
  large message transfer.
- ompi_info parsable output now escapes double quotes in values, and
  also quotes values that can contain colons.  Thanks to Lev Givon for the
  suggestion.
- CUDA-aware support can now handle GPUs within a node that do not
  support CUDA IPC.  Earlier versions would get an error and abort.
- Add an mca parameter ras_base_launch_orted_on_hn to allow for launching
  MPI processes on the same node where mpirun is executing using a separate
  orte daemon, rather than the mpirun process.   This may be useful to set to
  true when using SLURM, as it improves interoperability with SLURM's signal
  propagation tools.  By default it is set to false, except for Cray XC systems.
- Remove LoadLeveler RAS support.
- Remove IB XRC support from the OpenIB BTL due to lack of support.
- Add functionality for IBM s390 platforms.  Note that regular
  regression testing does not occur on the s390 and it is not
  considered a supported platform.
- Remove support for big endian PowerPC.
- Remove support for XL compilers older than v13.1.
- Remove support for atomic operations using MacOS atomics library.

3.0.2 -- June, 2018
-------------------

- Disable osc/pt2pt when using MPI_THREAD_MULTIPLE due to numerous
  race conditions in the component.
- Fix dummy variable names for the mpi and mpi_f08 Fortran bindings to
  match the MPI standard.  This may break applications which use
  name-based parameters in Fortran which used our internal names
  rather than those documented in the MPI standard.
- Fixed MPI_SIZEOF in the "mpi" Fortran module for the NAG compiler.
- Fix RMA function signatures for use-mpi-f08 bindings to have the
  asynchronous property on all buffers.
- Fix Fortran MPI_COMM_SPAWN_MULTIPLE to properly follow the count
  length argument when parsing the array_of_commands variable.
- Revamp Java detection to properly handle new Java versions which do
  not provide a javah wrapper.
- Improved configure logic for finding the UCX library.
- Add support for HDR InfiniBand link speeds.
- Disable the POWER 7/BE block in configure.  Note that POWER 7/BE is
  still not a supported platform, but it is no longer automatically
  disabled.  See
  https://github.com/open-mpi/ompi/issues/4349#issuecomment-374970982
  for more information.

3.0.1 -- March, 2018
--------------------

- Fix ability to attach parallel debuggers to MPI processes.
- Fix a number of issues in MPI I/O found by the HDF5 test suite.
- Fix (extremely) large message transfers with shared memory.
- Fix out of sequence bug in multi-NIC configurations.
- Fix stdin redirection bug that could result in lost input.
- Disable the LSF launcher if CSM is detected.
- Plug a memory leak in MPI_Mem_free().  Thanks to Philip Blakely for reporting.
- Fix the tree spawn operation when the number of nodes is larger than the radix.
  Thanks to Carlos Eduardo de Andrade for reporting.
- Fix Fortran 2008 macro in MPI extensions.  Thanks to Nathan T. Weeks for
  reporting.
- Add UCX to list of interfaces that OpenSHMEM will use by default.
- Add --{enable|disable}-show-load-errors-by-default to control
  default behavior of the load errors option.
- OFI MTL improvements: handle empty completion queues properly, fix
  incorrect error message around fi_getinfo(), use default progress
  option for provider by default, Add support for reading multiple
  CQ events in ofi_progress.
- PSM2 MTL improvements: Allow use of GPU buffers, thread fixes.
- Numerous corrections to memchecker behavior.
- Add an mca parameter ras_base_launch_orted_on_hn to allow for launching
  MPI processes on the same node where mpirun is executing using a separate
  orte daemon, rather than the mpirun process.   This may be useful to set to
  true when using SLURM, as it improves interoperability with SLURM's signal
  propagation tools.  By default it is set to false, except for Cray XC systems.
- Fix a problem reported on the mailing list separately by Kevin McGrattan and
  Stephen Guzik about consistency issues on NFS file systems when using OMPIO.
  This fix also introduces a new mca parameter fs_ufs_lock_algorithm which
  allows control of the locking algorithm used by ompio for read/write
  operations. By default, ompio does not perform locking on local UNIX file
  systems, locks the entire file per operation on NFS file systems, and uses
  selective byte-range locking on other distributed file systems.
- Add an mca parameter pmix_server_usock_connections to allow mpirun to
  support applications statically built against the Open MPI v2.x release,
  or installed in a container along with the Open MPI v2.x libraries. It is
  set to false by default.
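
The default OMPIO locking behavior described in the
fs_ufs_lock_algorithm item above can be summarized as a small
decision function.  This is a sketch of the stated defaults only: the
file system labels and the returned strings are made up for
illustration and are not the values accepted by the MCA parameter.

```python
def default_lock_scope(fs_kind):
    """Map a file system kind to the default locking described above.

    fs_kind is a hypothetical label, not an Open MPI identifier:
    "local", "nfs", or anything else for other distributed file systems.
    """
    if fs_kind == "local":
        return "none"          # no locking on local UNIX file systems
    if fs_kind == "nfs":
        return "entire-file"   # whole file locked per operation
    return "byte-range"        # selective byte-range locking


for kind in ("local", "nfs", "other-distributed"):
    print(kind, default_lock_scope(kind))
```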

3.0.0 -- September, 2017
------------------------

Major new features:

- Use UCX allocator for OSHMEM symmetric heap allocations to optimize intra-node
  data transfers.  UCX SPML only.
- Use UCX multi-threaded API in the UCX PML.  Requires UCX 1.0 or later.
- Added support for Flux PMI
- Update embedded PMIx to version 2.1.0
- Update embedded hwloc to version 1.11.7

Changes in behavior compared to prior versions:

- Per Open MPI's versioning scheme (see the README), increasing the
  major version number to 3 indicates that this version is not
  ABI-compatible with prior versions of Open MPI. In addition, there may
  be differences in MCA parameter names and defaults from previous releases.
  Command line options for mpirun and other commands may also differ from
  previous versions. You will need to recompile MPI and OpenSHMEM applications
  to work with this version of Open MPI.
- With this release, Open MPI supports MPI_THREAD_MULTIPLE by default.
- New configure options have been added to specify the locations of libnl
  and zlib.
- A new configure option has been added to request Flux PMI support.
- The help menu for mpirun and related commands is now context based.
  "mpirun --help compatibility" generates the help menu in the same format
  as previous releases.

Removed legacy support:
- AIX is no longer supported.
- LoadLeveler is no longer supported.
- OpenSHMEM currently supports the UCX and MXM transports via the ucx and ikrit
  SPMLs respectively.
- Remove IB XRC support from the OpenIB BTL due to lack of support.
- Remove support for big endian PowerPC.
- Remove support for XL compilers older than v13.1

Known issues:

- MPI_Connect/accept between applications started by different mpirun
  commands will fail, even if ompi-server is running.

2.1.5 -- August, 2018
---------------------

- A subtle race condition bug was discovered in the "vader" BTL
  (shared memory communications) that, in rare instances, can cause
  MPI processes to crash or incorrectly classify (or effectively drop)
  an MPI message sent via shared memory.  If you are using the "ob1"
  PML with "vader" for shared memory communication (note that vader is
  the default for shared memory communication with ob1), you need to
  upgrade to v2.1.5 to fix this issue.  You may also upgrade to the
  following versions to fix this issue:
  - Open MPI v3.0.1 (released March, 2018) or later in the v3.0.x
    series
  - Open MPI v3.1.2 (expected end of August, 2018) or later
- A link issue was fixed when the UCX library was not located in the
  linker-default search paths.

2.1.4 -- August, 2018
---------------------

Bug fixes/minor improvements:
- Disable the POWER 7/BE block in configure.  Note that POWER 7/BE is
  still not a supported platform, but it is no longer automatically
  disabled.  See
  https://github.com/open-mpi/ompi/issues/4349#issuecomment-374970982
  for more information.
- Fix bug with request-based one-sided MPI operations when using the
  "rdma" component.
- Fix issue with large data structure in the TCP BTL causing problems
  in some environments.  Thanks to @lgarithm for reporting the issue.
- Minor Cygwin build fixes.
- Minor fixes for the openib BTL:
  - Support for the QLogic RoCE HCA
  - Support for the Broadcom Cumulus RoCE HCA
  - Enable support for HDR link speeds
- Fix MPI_FINALIZED hang if invoked from an attribute destructor
  during the MPI_COMM_SELF destruction in MPI_FINALIZE.  Thanks to
  @AndrewGaspar for reporting the issue.
- Java fixes:
  - Modernize Java framework detection, especially on OS X/MacOS.
    Thanks to Bryce Glover for reporting and submitting the fixes.
  - Prefer "javac -h" to "javah" to support newer Java frameworks.
- Fortran fixes:
  - Use conformant dummy parameter names for Fortran bindings.  Thanks
    to Themos Tsikas for reporting and submitting the fixes.
  - Build the MPI_SIZEOF() interfaces in the "TKR"-style "mpi" module
    whenever possible.  Thanks to Themos Tsikas for reporting the
    issue.
  - Fix array of argv handling for the Fortran bindings of
    MPI_COMM_SPAWN_MULTIPLE (and its associated man page).
  - Make NAG Fortran compiler support more robust in configure.
- Disable the "pt2pt" one-sided MPI component when MPI_THREAD_MULTIPLE
  is used.  This component is simply not safe in MPI_THREAD_MULTIPLE
  scenarios, and will not be fixed in the v2.1.x series.
- Make the "external" hwloc component fail gracefully if it tries
  to use an hwloc v2.x.y installation.  hwloc v2.x.y will not be
  supported in the Open MPI v2.1.x series.
- Fix "vader" shared memory support for messages larger than 2GB.
  Thanks to Heiko Bauke for the bug report.
- Configure fixes for external PMI directory detection.  Thanks to
  Davide Vanzo for the report.

2.1.3 -- March, 2018
--------------------

Bug fixes/minor improvements:
- Update internal PMIx version to 1.2.5.
- Fix a problem with ompi_info reporting using param option.
  Thanks to Alexander Pozdneev for reporting.
- Correct PMPI_Aint_{add|diff} to be functions (not subroutines)
  in the Fortran mpi_f08 module.
- Fix a problem when doing MPI I/O using data types with large
  extents in conjunction with MPI_TYPE_CREATE_SUBARRAY.  Thanks to
  Christopher Brady for reporting.
- Fix a problem when opening many files using MPI_FILE_OPEN.
  Thanks to William Dawson for reporting.
- Fix a problem with debuggers failing to attach to a running job.
  Thanks to Dirk Schubert for reporting.
- Fix a problem when using madvise and the OpenIB BTL.  Thanks to
  Timo Bingmann for reporting.
- Fix a problem in the Vader BTL that resulted in failures of
  IMB under certain circumstances.  Thanks to Nicolas Morey-
  Chaisemartin for reporting.
- Fix a problem preventing Open MPI from working under Cygwin.
  Thanks to Marco Atzeri for reporting.
- Reduce some verbosity being emitted by the USNIC BTL under certain
  circumstances.  Thanks to Peter Forai for reporting.
- Fix a problem with misdirection of SIGKILL.  Thanks to Michael Fern
  for reporting.
- Replace use of posix_memalign with malloc for small allocations.  Thanks
  to Ben Menadue for reporting.
- Fix a problem with Open MPI's out of band TCP network for file descriptors
  greater than 32767.  Thanks to Wojtek Wasko for reporting and fixing.
- Plug a memory leak in MPI_Mem_free().  Thanks to Philip Blakely for reporting.

2.1.2 -- September, 2017
------------------------

Bug fixes/minor improvements:
- Update internal PMIx version to 1.2.3.
- Fix some problems when using the NAG Fortran compiler to build Open MPI
  and when using the compiler wrappers.  Thanks to Neil Carlson for reporting.
- Fix a compilation problem with the SM BTL.  Thanks to Paul Hargrove for
  reporting.
- Fix a problem with MPI_IALLTOALLW when using zero-length messages.
  Thanks to Dahai Guo for reporting.
- Fix a problem with C11 generic type interface for SHMEM_G.  Thanks
  to Nick Park for reporting.
- Switch to using the lustreapi.h include file when building Open MPI
  with Lustre support.
- Fix a problem in the OB1 PML that led to hangs with OSU collective tests.
- Fix a progression issue with MPI_WIN_FLUSH_LOCAL.  Thanks to
  Joseph Schuchart for reporting.
- Fix an issue with recent versions of PBSPro requiring libcrypto.
  Thanks to Petr Hanousek for reporting.
- Fix a problem when using MPI_ANY_SOURCE with MPI_SENDRECV.
- Fix an issue that prevented signals from being propagated to ORTE
  daemons.
- Ensure that signals are forwarded from ORTE daemons to all processes
  in the process group created by the daemons.  Thanks to Ted Sussman
  for reporting.
- Fix a problem with launching a job under a debugger. Thanks to
  Greg Lee for reporting.
- Fix a problem with Open MPI native I/O MPI_FILE_OPEN when using
  a communicator having an associated topology.  Thanks to
  Wei-keng Liao for reporting.
- Fix an issue when using MPI_ACCUMULATE with derived datatypes.
- Fix a problem with Fortran bindings that led to compilation errors
  for user defined reduction operations.  Thanks to Nathan Weeks for
  reporting.
- Fix ROMIO issues with large writes/reads when using NFS file systems.
- Fix definition of Fortran MPI_ARGV_NULL and MPI_ARGVS_NULL.
- Enable use of the head node of a SLURM allocation on Cray XC systems.
- Fix a problem with synchronous sends when using the UCX PML.
- Use default socket buffer size to improve TCP BTL performance.
- Add an mca parameter ras_base_launch_orted_on_hn to allow for launching
  MPI processes on the same node where mpirun is executing using a separate
  orte daemon, rather than the mpirun process.   This may be useful to set to
  true when using SLURM, as it improves interoperability with SLURM's signal
  propagation tools.  By default it is set to false, except for Cray XC systems.
- Fix --without-lsf when lsf is installed in the default search path.
- Remove support for big endian PowerPC.
- Remove support for XL compilers older than v13.1
- Remove IB XRC support from the OpenIB BTL due to loss of maintainer.

2.1.1 -- April, 2017
--------------------

Bug fixes/minor improvements:

- Fix a problem with one of Open MPI's fifo data structures which led to
  hangs in a make check test.  Thanks to Nicolas Morey-Chaisemartin for
  reporting.
- Add missing MPI_AINT_ADD/MPI_AINT_DIFF function definitions to mpif.h.
  Thanks to Aboorva Devarajan for reporting.
- Fix the error return from MPI_WIN_LOCK when rank argument is invalid.
  Thanks to Jeff Hammond for reporting and fixing this issue.
- Fix a problem with mpirun/orterun when started under a debugger. Thanks
  to Gregory Leff for reporting.
- Add configury option to disable use of CMA by the vader BTL.  Thanks
  to Sascha Hunold for reporting.
- Add configury check for MPI_DOUBLE_COMPLEX datatype support.
  Thanks to Alexander Klein for reporting.
- Fix memory allocated by MPI_WIN_ALLOCATE_SHARED to
  be 64 bit aligned.  Thanks to Joseph Schuchart for
  reporting.
- Update MPI_WTICK man page to reflect possibly higher
  resolution than 10e-6.  Thanks to Mark Dixon for
  reporting.
- Add missing MPI_T_PVAR_SESSION_NULL definition to mpi.h
  include file.  Thanks to Omri Mor for this contribution.
- Enhance the Open MPI spec file to install modulefile in /opt
  if installed in a non-default location.  Thanks to Kevin
  Buckley for reporting and supplying a fix.
- Fix a problem with conflicting PMI symbols when linking statically.
  Thanks to Kilian Cavalotti for reporting.

Known issues (to be addressed in v2.1.2):

- See the list of fixes slated for v2.1.2 here:
  https://github.com/open-mpi/ompi/milestone/28

2.1.0 -- March, 2017
--------------------

Major new features:

- The main focus of the Open MPI v2.1.0 release was to update to PMIx
  v1.2.1.  When using PMIx (e.g., via mpirun-based launches, or via
  direct launches with recent versions of popular resource managers),
  launch time scalability is improved, and the run time memory
  footprint is greatly decreased when launching large numbers of MPI /
  OpenSHMEM processes.
- Update OpenSHMEM API conformance to v1.3.
- The usnic BTL now supports MPI_THREAD_MULTIPLE.
- General/overall performance improvements to MPI_THREAD_MULTIPLE.
- Add a summary message at the bottom of configure that tells you many
  of the configuration options specified and/or discovered by Open
  MPI.

Changes in behavior compared to prior versions:

- None.

Removed legacy support:

- The ptmalloc2 hooks have been removed from the Open MPI code base.
  This is not really a user-noticeable change; it is only mentioned
  here because there was much rejoicing in the Open MPI developer
  community.

Bug fixes/minor improvements:

- New MCA parameters:
  - iof_base_redirect_app_stderr_to_stdout: as its name implies, it
    combines MPI / OpenSHMEM applications' stderr into its stdout
    stream.
  - opal_event_include: allow the user to specify which FD selection
    mechanism is used by the underlying event engine.
  - opal_stacktrace_output: indicate where stacktraces should be sent
    upon MPI / OpenSHMEM process crashes ("none", "stdout", "stderr",
    "file:filename").
  - orte_timeout_for_stack_trace: number of seconds to wait for stack
    traces to be reported (or <=0 to wait forever).
  - mtl_ofi_control_prog_type/mtl_ofi_data_prog_type: specify libfabric
    progress model to be used for control and data.
- Fix MPI_WTICK regression where the time reported may be inaccurate
  on systems with processor frequency scaling enabled.
- Fix regression that lowered the memory maximum message bandwidth for
  large messages on some BTL network transports, such as openib, sm,
  and vader.
- Fix a name collision in the shared file pointer MPI IO file locking
  scheme.  Thanks to Nicolas Joly for reporting the issue.
- Fix datatype extent/offset errors in MPI_PUT and MPI_RACCUMULATE
  when using the Portals 4 one-sided component.
- Add support for non-contiguous datatypes to the Portals 4 one-sided
  component.
- Various updates for the UCX PML.
- Updates to the following man pages:
  - mpirun(1)
  - MPI_COMM_CONNECT(3)
  - MPI_WIN_GET_NAME(3). Thanks to Nicolas Joly for reporting the
    typo.
  - MPI_INFO_GET_[NKEYS|NTHKEY](3). Thanks to Nicolas Joly for
    reporting the typo.
- Fixed a problem in the TCP BTL when using MPI_THREAD_MULTIPLE.
  Thanks to Evgueni Petrov for reporting.
- Fixed external32 representation in the romio314 module.  Note that
  for now, external32 representation is not correctly supported by the
  ompio module.  Thanks to Thomas Gastine for bringing this to our
  attention.
- Add note how to disable a warning message about when a high-speed
  MPI transport is not found.  Thanks to Susan Schwarz for reporting
  the issue.
- Ensure that sending SIGINT when using the rsh/ssh launcher does not
  orphan children nodes in the launch tree.
- Fix the help message when showing deprecated MCA param names to show
  the correct (i.e., deprecated) name.
- Enable support for the openib BTL to use multiple different
  InfiniBand subnets.
- Fix a minor error in MPI_AINT_DIFF.
- Fix bugs with MPI_IN_PLACE handling in:
  - MPI_ALLGATHER[V]
  - MPI_[I][GATHER|SCATTER][V]
  - MPI_IREDUCE[_SCATTER]
  - Thanks to all the users who helped diagnose these issues.
- Allow qrsh to tree spawn (if the back-end system supports it).
- Fix MPI_T_PVAR_GET_INDEX to return the correct index.
- Correctly position the shared file pointer in append mode in the
  OMPIO component.
- Add some deprecated names into shmem.h for backwards compatibility
  with legacy codes.
- Fix MPI_MODE_NOCHECK support.
- Fix a regression in PowerPC atomics support.  Thanks to Orion
  Poplawski for reporting the issue.
- Fixes for assembly code with aggressively-optimized compilers on
  x86_64/AMD64 platforms.
- Fix one more place where configure was mangling custom CFLAGS.
  Thanks to Phil Tooley (@Telemin) for reporting the issue.
- Better handle builds with external installations of hwloc.
- Fixed a hang with MPI_PUT and MPI_WIN_LOCK_ALL.
- Fixed a bug when using MPI_GET on non-contiguous datatypes and
  MPI_LOCK/MPI_UNLOCK.
- Fixed a bug when using POST/START/COMPLETE/WAIT after a fence.
- Fix configure portability by cleaning up a few uses of "==" with
  "test".  Thanks to Kevin Buckley for pointing out the issue.
- Fix bug when using darrays with lb and extent of darray datatypes.
- Updates to make Open MPI binary builds more bit-for-bit
  reproducible.  Thanks to Alastair McKinstry for the suggestion.
- Fix issues regarding persistent request handling.
- Ensure that shmemx.h is a standalone OpenSHMEM header file.  Thanks
  to Nick Park (@nspark) for the report.
- Ensure that we always send SIGTERM prior to SIGKILL.  Thanks to Noel
  Rycroft for the report.
- Added ConnectX-5 and Chelsio T6 device defaults for the openib BTL.
- OpenSHMEM no longer supports MXM less than v2.0.
- Plug a memory leak in ompi_osc_sm_free.  Thanks to Joseph Schuchart
  for the report.
- The "self" BTL now uses less memory.
- The vader BTL is now more efficient in terms of memory usage when
  using XPMEM.
- Removed the --enable-openib-failover configure option.  This is not
  considered backwards-incompatible because this option was stale and
  had long-since stopped working, anyway.
- Allow jobs launched under Cray aprun to use hyperthreads if
  opal_hwloc_base_hwthreads_as_cpus MCA parameter is set.
- Add support for 32-bit and floating point Cray Aries atomic
  operations.
- Add support for network AMOs for MPI_ACCUMULATE, MPI_FETCH_AND_OP,
  and MPI_COMPARE_AND_SWAP if the "ompi_single_intrinsic" info key is
  set on the window or the "acc_single_intrinsic" MCA param is set.
- Automatically disqualify RDMA CM support in the openib BTL if
  MPI_THREAD_MULTIPLE is used.
- Make configure smarter/better about auto-detecting Linux CMA
  support.
- Improve the scalability of MPI_COMM_SPLIT_TYPE.
- Fix the mixing of C99 and C++ header files with the MPI C++
  bindings.  Thanks to Alastair McKinstry for the bug report.
- Add support for ARM v8.
- Several MCA parameters now directly support MPI_T enumerator
  semantics (i.e., they accept a limited set of values -- e.g., MCA
  parameters that accept boolean values).
- Added --with-libmpi-name=STRING configure option for vendor releases
  of Open MPI.  See the README for more detail.
- Fix a problem with Open MPI's internal memory checker.  Thanks to Yvan
  Fournier for reporting.
- Fix a multi-threaded issue with MPI_WAIT.  Thanks to Pascal Deveze for
  reporting.
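
As an illustration of the SIGTERM-before-SIGKILL item above, the
following is a minimal Python sketch of the general pattern: ask a
child process to exit, give it a grace period, and escalate only if
it is still running.  It is not Open MPI's launcher code; the
function name stop_process and the grace_seconds default are
hypothetical.

```python
import subprocess


def stop_process(proc, grace_seconds=5.0):
    """Ask a child to exit with SIGTERM; escalate to SIGKILL only if needed.

    `proc` is a subprocess.Popen object.  Returns the child's return code
    (on POSIX, a negative signal number if a signal ended the process).
    """
    if proc.poll() is not None:
        return proc.returncode      # already exited, nothing to signal
    proc.terminate()                # SIGTERM on POSIX: the child may clean up
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()                 # SIGKILL on POSIX: cannot be caught or ignored
        return proc.wait()
```

Sending SIGTERM first matters because a process can trap it and
release resources (temporary files, shared memory segments) before
exiting, which is not possible with SIGKILL.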

Known issues (to be addressed in v2.1.1):

- See the list of fixes slated for v2.1.1 here:
  https://github.com/open-mpi/ompi/milestone/26

2.0.4 -- November, 2017
-----------------------

Bug fixes/minor improvements:
- Fix an issue with visibility of functions defined in the built-in PMIx.
  Thanks to Siegmar Gross for reporting this issue.
- Add configure check to prevent trying to build this release of
  Open MPI with an external hwloc 2.0 or newer release.
- Add ability to specify layered providers for OFI MTL.
- Fix a correctness issue with Open MPI's memory manager code
  that could result in corrupted message data.  Thanks to
  Valentin Petrov for reporting.
- Fix issues encountered when using newer versions of PBS Pro.
  Thanks to Petr Hanousek for reporting.
- Fix a problem with MPI_GET when using the vader BTL.  Thanks
  to Dahai Guo for reporting.
- Fix a problem when using MPI_ANY_SOURCE with MPI_SENDRECV_REPLACE.
  Thanks to Dahai Guo for reporting.
- Fix a problem using MPI_FILE_OPEN with a communicator with an
  attached cartesian topology.  Thanks to Wei-keng Liao for reporting.
- Remove IB XRC support from the OpenIB BTL due to lack of support.
- Remove support for big endian PowerPC.
- Remove support for XL compilers older than v13.1.

2.0.3 -- June 2017
------------------

Bug fixes/minor improvements:

 - Fix a problem with MPI_IALLTOALLW when zero size messages are present.
   Thanks to @mathbird for reporting.
 - Add missing MPI_USER_FUNCTION definition to the mpi_f08 module.
   Thanks to Nathan Weeks for reporting this issue.
 - Fix a problem with MPI_WIN_LOCK not returning an error code when
   a negative rank is supplied.  Thanks to Jeff Hammond for reporting and
   providing a fix.
 - Fix a problem with make check that could lead to hangs.  Thanks to
   Nicolas Morey-Chaisemartin for reporting.
 - Resolve a symbol conflict problem with PMI-1 and PMI-2 PMIx components.
   Thanks to Kilian Cavalotti for reporting this issue.
 - Ensure that memory allocations returned from MPI_WIN_ALLOCATE_SHARED are
   64-byte aligned.  Thanks to Joseph Schuchart for reporting this issue.
 - Make use of DOUBLE_COMPLEX, if available, for Fortran bindings.  Thanks
   to Alexander Klein for reporting this issue.
 - Add missing MPI_T_PVAR_SESSION_NULL definition to Open MPI mpi.h include
   file.  Thanks to Omri Mor for reporting and fixing.
 - Fix a problem with use of MPI shared file pointers when accessing
   a file from independent jobs.  Thanks to Nicolas Joly for reporting
   this issue.
 - Optimize zero size MPI_IALLTOALL{V,W} with MPI_IN_PLACE.  Thanks to
   Lisandro Dalcin for the report.
 - Fix a ROMIO buffer overflow problem for large transfers when using NFS
   filesystems.
 - Fix type of MPI_ARGV[S]_NULL which prevented it from being used
   properly with MPI_COMM_SPAWN[_MULTIPLE] in the mpi_f08 module.
 - Ensure that proper linker flags are added to the wrapper compilers
   for dynamic libraries on platforms that need it (e.g., RHEL 7.3 and
   later).
 - Get better performance on TCP-based networks 10Gbps and higher by
   using OS defaults for buffer sizing.
 - Fix a bug with MPI_[R][GET_]ACCUMULATE when using DARRAY datatypes.
 - Fix handling of --with-lustre configure command line argument.
   Thanks to Prentice Bisbal and Tim Mattox for reporting the issue.
 - Added MPI_AINT_ADD and MPI_AINT_DIFF declarations to mpif.h.  Thanks
   to Aboorva Devarajan (@AboorvaDevarajan) for the bug report.
 - Fix a problem in the TCP BTL when Open MPI is initialized with
   MPI_THREAD_MULTIPLE support.  Thanks to Evgueni Petrov for analyzing and
   reporting this issue.
 - Fix yalla PML to properly handle underflow errors, and fixed a
   memory leak with blocking non-contiguous sends.
 - Restored ability to run autogen.pl on official distribution tarballs
   (although this is still not recommended for most users!).
 - Fix accuracy problems with MPI_WTIME on some systems by always using
   either clock_gettime(3) or gettimeofday(3).
 - Fix a problem where MPI_WTICK was not returning a higher time resolution
   when available.  Thanks to Mark Dixon for reporting this issue.
 - Restore SGE functionality.  Thanks to Kevin Buckley for the initial
   report.
 - Fix external hwloc compilation issues, and extend support to allow
   using external hwloc installations as far back as v1.5.0.  Thanks to
   Orion Poplawski for raising the issue.
 - Added latest Mellanox ConnectX and Chelsio T6 adapter part IDs to
   the openib list of default values.
 - Do a better job of cleaning up session directories (e.g., in /tmp).
 - Update a help message to indicate how to suppress a warning about
   no high performance networks being detected by Open MPI.  Thanks to
   Susan Schwarz for reporting this issue.
 - Fix a problem with mangling of custom CFLAGS when configuring Open MPI.
   Thanks to Phil Tooley for reporting.
 - Fix some minor memory leaks and remove some unused variables.
   Thanks to Joshua Gerrard for reporting.
 - Fix MPI_ALLGATHERV bug with MPI_IN_PLACE.
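
The MPI_WTIME and MPI_WTICK items above concern which system clock
backs the timer and what resolution is reported for it.  The
following Python sketch shows the same idea in miniature: prefer a
clock_gettime(3)-style source, fall back to a gettimeofday(3)-style
one, and report the resolution of whichever clock is in use.  The
names wall_time and wall_tick are hypothetical, and the use of
CLOCK_MONOTONIC is an assumption; the entries above do not say which
clock ID Open MPI selects.

```python
import time


def wall_time():
    """Seconds from a clock_gettime source if present, else time.time()."""
    if hasattr(time, "clock_gettime") and hasattr(time, "CLOCK_MONOTONIC"):
        # Assumption: CLOCK_MONOTONIC, so the value never jumps backwards.
        return time.clock_gettime(time.CLOCK_MONOTONIC)
    return time.time()              # gettimeofday-style wall clock fallback


def wall_tick():
    """Resolution, in seconds, of the clock used by wall_time()."""
    if hasattr(time, "clock_getres") and hasattr(time, "CLOCK_MONOTONIC"):
        return time.clock_getres(time.CLOCK_MONOTONIC)
    return 1e-6                     # gettimeofday(3) reports microseconds
```

Neither source depends on the CPU frequency, unlike a raw cycle
counter.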

Known issues (to be addressed in v2.0.4):

- See the list of fixes slated for v2.0.4 here:
  https://github.com/open-mpi/ompi/milestone/29

2.0.2 -- 26 January 2017
------------------------

Bug fixes/minor improvements:

- Fix a problem with MPI_FILE_WRITE_SHARED when using MPI_MODE_APPEND and
  Open MPI's native MPI-IO implementation.  Thanks to Nicolas Joly for
  reporting.
- Fix a typo in the MPI_WIN_GET_NAME man page.  Thanks to Nicolas Joly
  for reporting.
- Fix a race condition with ORTE's session directory setup.  Thanks to
  @tbj900 for reporting this issue.
- Fix a deadlock issue arising from Open MPI's approach to catching calls to
  munmap. Thanks to Paul Hargrove for reporting and helping to analyze this
  problem.
- Fix a problem with PPC atomics which caused make check to fail unless builtin
  atomics configure option was enabled.  Thanks to Orion Poplawski for reporting.
- Fix a problem with use of x86_64 cpuid instruction which led to segmentation
  faults when Open MPI was configured with -O3 optimization.  Thanks to Mark
  Santcroos for reporting this problem.
- Fix a problem when using built-in atomics configure options on PPC platforms
  when building 32-bit applications.  Thanks to Paul Hargrove for reporting.
- Fix a problem with building Open MPI against an external hwloc installation.
  Thanks to Orion Poplawski for reporting this issue.
- Remove use of DATE in the message queue version string reported to debuggers to
  ensure bit-wise reproducibility of binaries.  Thanks to Alastair McKinstry
  for help in fixing this problem.
- Fix a problem with early exit of a MPI process without calling MPI_FINALIZE
  or MPI_ABORT that could lead to job hangs.  Thanks to Christof Koehler for
  reporting.
- Fix a problem with forwarding of SIGTERM signal from mpirun to MPI processes
  in a job.  Thanks to Noel Rycroft for reporting this problem.
- Plug some memory leaks in MPI_WIN_FREE discovered using Valgrind.  Thanks
  to Joseph Schuchart for reporting.
- Fix a problem with MPI_NEIGHBOR_ALLTOALL when using a communicator with an
  empty topology graph.  Thanks to Daniel Ibanez for reporting.
- Fix a typo in a PMIx component help file.  Thanks to @njoly for reporting this.
- Fix a problem with Valgrind false positives when using Open MPI's internal memchecker.
  Thanks to Yvan Fournier for reporting.
- Fix a problem with MPI_FILE_DELETE returning MPI_SUCCESS when
  deleting a non-existent file. Thanks to Wei-keng Liao for reporting.
- Fix a problem with MPI_IMPROBE that could lead to hangs in subsequent MPI
  point to point or collective calls.  Thanks to Chris Pattison for reporting.
- Fix a problem when configuring Open MPI for powerpc with --enable-mpi-cxx
  enabled.  Thanks to Alastair McKinstry for reporting.
- Fix a problem using MPI_IALLTOALL with MPI_IN_PLACE argument.  Thanks to
  Chris Ward for reporting.
- Fix a problem using MPI_RACCUMULATE with the Portals4 transport.  Thanks to
  @PDeveze for reporting.
- Fix an issue with static linking and duplicate symbols arising from PMIx
  Slurm components.  Thanks to Limin Gu for reporting.
- Fix a problem when using MPI dynamic memory windows.  Thanks to
  Christoph Niethammer for reporting.
- Fix a problem with Open MPI's pkgconfig files.  Thanks to Alastair McKinstry
  for reporting.
- Fix a problem with MPI_IREDUCE when the same buffer is supplied for the
  send and recv buffer arguments.  Thanks to Valentin Petrov for reporting.
- Fix a problem with atomic operations on PowerPC.  Thanks to Paul
  Hargrove for reporting.

Known issues (to be addressed in v2.0.3):

- See the list of fixes slated for v2.0.3 here:
  https://github.com/open-mpi/ompi/milestone/23

2.0.1 -- 2 September 2016
-------------------------

Bug fixes/minor improvements:

- Short message latency and message rate performance improvements for
  all transports.
- Fix shared memory performance when using RDMA-capable networks.
  Thanks to Tetsuya Mishima and Christoph Niethammer for reporting.
- Fix bandwidth performance degradation in the yalla (MXM) PML.  Thanks
  to Andreas Kempf for reporting the issue.
- Fix OpenSHMEM crash when running on non-Mellanox MXM-based networks.
  Thanks to Debendra Das for reporting the issue.
- Fix a crash occurring after repeated calls to MPI_FILE_SET_VIEW with
  predefined datatypes.  Thanks to Eric Chamberland and Matthew
  Knepley for reporting and helping chase down this issue.
- Fix stdin propagation to MPI processes.  Thanks to Jingchao Zhang
  for reporting the issue.
- Fix various runtime and portability issues by updating the PMIx
  internal component to v1.1.5.
- Fix process startup failures on Intel MIC platforms due to very
  large entries in /proc/mounts.
- Fix a problem with use of a relative path for specifying executables to
  mpirun/oshrun.  Thanks to David Schneider for reporting.
- Various improvements when running over portals-based networks.
- Fix thread-based race conditions with GNI-based networks.
- Fix a problem with MPI_FILE_CLOSE and MPI_FILE_SET_SIZE.  Thanks
  to Cihan Altinay for reporting.
- Remove all use of rand(3) from within Open MPI so as not to perturb
  applications' use of it.  Thanks to Matias Cabral and Noel Rycroft
  for reporting.
- Fix crash in MPI_COMM_SPAWN.
- Fix types for MPI_UNWEIGHTED and MPI_WEIGHTS_EMPTY.  Thanks to
  Lisandro Dalcin for reporting.
- Correctly report the name of MPI_INTEGER16.
- Add some missing MPI constants to the Fortran bindings.
- Fixed compile error when configuring Open MPI with --enable-timing.
- Correctly set the shared library version of libompitrace.so.  Thanks
  to Alastair McKinstry for reporting.
- Fix errors in the MPI_RPUT, MPI_RGET, MPI_RACCUMULATE, and
  MPI_RGET_ACCUMULATE Fortran bindings.  Thanks to Alfio Lazzaro and
  Joost VandeVondele for tracking this down.
- Fix problems with use of derived datatypes in non-blocking
  collectives.  Thanks to Yuki Matsumoto for reporting.
- Fix problems with OpenSHMEM header files when using CMake.  Thanks to
  Paul Kapinos for reporting the issue.
- Fix problem with use of non-zero lower bound datatypes in
  collectives.  Thanks to Hristo Iliev for reporting.
- Fix a problem with memory allocation within MPI_GROUP_INTERSECTION.
  Thanks to Lisandro Dalcin for reporting.
- Fix an issue with MPI_ALLGATHER for communicators that don't consist
  of two ranks.  Thanks to David Love for reporting.
- Various fixes for collectives when used with esoteric MPI datatypes.
- Fixed corner cases of handling DARRAY and HINDEXED_BLOCK datatypes.
- Fix a problem with filesystem type check for OpenBSD.
  Thanks to Paul Hargrove for reporting.
- Fix some debug input within Open MPI internal functions.  Thanks to
  Durga Choudhury for reporting.
- Fix a typo in a configury help message.  Thanks to Paul Hargrove for
  reporting.
- Correctly support MPI_IN_PLACE in MPI_[I]ALLTOALL[V|W] and
  MPI_[I]EXSCAN.
- Fix alignment issues on SPARC platforms.

Known issues (to be addressed in v2.0.2):

- See the list of fixes slated for v2.0.2 here:
  https://github.com/open-mpi/ompi/milestone/20, and
  https://github.com/open-mpi/ompi-release/milestone/19
  (note that the "ompi-release" Github repo will be folded/absorbed
  into the "ompi" Github repo at some point in the future)

2.0.0 -- 12 July 2016
---------------------

 **********************************************************************
 *  Open MPI is now fully MPI-3.1 compliant
 **********************************************************************

Major new features:

- Many enhancements to MPI RMA.  Open MPI now maps MPI RMA operations
  on to native RMA operations for those networks which support this
  capability.
- Greatly improved support for MPI_THREAD_MULTIPLE (when configured
  with --enable-mpi-thread-multiple).
- Enhancements to reduce the memory footprint for jobs at scale.  A
  new MCA parameter, "mpi_add_procs_cutoff", is available to set the
  threshold for using this feature.
- Completely revamped support for memory registration hooks when using
  OS-bypass network transports.
- Significant OMPIO performance improvements and many bug fixes.
- Add support for PMIx - Process Management Interface for Exascale.
  Version 1.1.2 of PMIx is included internally in this release.
- Add support for PLFS file systems in Open MPI I/O.
- Add support for UCX transport.
- Simplify build process for Cray XC systems.  Add support for
  using native SLURM.
- Add a --tune mpirun command line option to simplify setting many
  environment variables and MCA parameters.
- Add a new MCA parameter "orte_default_dash_host" to offer an analogue
  to the existing "orte_default_hostfile" MCA parameter.
- Add the ability to specify the number of desired slots in the mpirun
  --host option.

Changes in behavior compared to prior versions:

- In environments where mpirun cannot automatically determine the
  number of slots available (e.g., when using a hostfile that does not
  specify "slots", or when using --host without specifying a ":N"
  suffix to hostnames), mpirun now requires the use of "-np N" to
  specify how many MPI processes to launch.
- The MPI C++ bindings -- which were removed from the MPI standard in
  v3.0 -- are no longer built by default and will be removed in some
  future version of Open MPI.  Use the --enable-mpi-cxx configure
  option to build the deprecated/removed MPI C++ bindings.
- ompi_info now shows all components, even if they do not have MCA
  parameters.  The prettyprint output now separates groups with a
  dashed line.
- OMPIO is now the default implementation of parallel I/O, with the
  exception for Lustre parallel filesystems (where ROMIO is still the
  default).  The default selection of OMPIO vs. ROMIO can be controlled
  via the "--mca io ompio|romio" command line switch to mpirun.
- Per Open MPI's versioning scheme (see the README), increasing the
  major version number to 2 indicates that this version is not
  ABI-compatible with prior versions of Open MPI.  You will need to
  recompile MPI and OpenSHMEM applications to work with this version
  of Open MPI.
- Removed checkpoint/restart code due to loss of maintainer. :-(
- Change the behavior for handling certain signals when using PSM and
  PSM2 libraries.  Previously, the PSM and PSM2 libraries would trap
  certain signals in order to generate tracebacks.  The mechanism was
  found to cause issues with Open MPI's own error reporting mechanism.
  If not already set, Open MPI now sets the IPATH_NO_BACKTRACE and
  HFI_NO_BACKTRACE environment variables to disable PSM/PSM2's
  handling these signals.
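
The first item in this list (mpirun requiring "-np N" when slot
counts are unknown) can be pictured with a small Python sketch.  It
models only the rule as stated above for --host strings; it is not
Open MPI's parser.  The function name resolve_process_count is
hypothetical, and the default of one process per slot when every host
carries a ":N" suffix is an assumption.

```python
def resolve_process_count(host_arg, np=None):
    """Return how many processes to launch for a --host style string.

    host_arg: comma-separated host names, each optionally suffixed ":N".
    np:       explicit process count (the "-np N" value), or None.
    """
    slots = []
    for entry in host_arg.split(","):
        name, sep, count = entry.strip().partition(":")
        if not name:
            raise ValueError("empty host name in %r" % host_arg)
        # None marks a host whose slot count was not given.
        slots.append(int(count) if sep else None)

    if np is not None:
        return np                   # an explicit -np decides in this sketch
    if any(s is None for s in slots):
        raise ValueError("slot count unknown; an explicit -np N is required")
    return sum(slots)               # assumption: one process per slot
```

For example, resolve_process_count("node1:2,node2:4") gives 6, while
resolve_process_count("node1,node2") raises an error unless np is
supplied.  Oversubscription checks are deliberately left out.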

Removed legacy support:

- Removed support for OS X Leopard.
- Removed support for Cray XT systems.
- Removed VampirTrace.
- Removed support for Myrinet/MX.
- Removed legacy collective module: ML.
- Removed support for Alpha processors.
- Removed --enable-mpi-profiling configure option.

Known issues (to be addressed in v2.0.1):

- See the list of fixes slated for v2.0.1 here:
  https://github.com/open-mpi/ompi/milestone/16, and
  https://github.com/open-mpi/ompi-release/milestone/16
  (note that the "ompi-release" Github repo will be folded/absorbed
  into the "ompi" Github repo at some point in the future)

- ompi-release#986: Fix data size counter for large ops with fcoll/static
- ompi-release#987: Fix OMPIO performance on Lustre
- ompi-release#1013: Fix potential inconsistency in btl/openib default settings
- ompi-release#1014: Do not return MPI_ERR_PENDING from collectives
- ompi-release#1056: Remove dead profile code from oshmem
- ompi-release#1081: Fix MPI_IN_PLACE checking for IALLTOALL{V|W}
- ompi-release#1081: Fix memchecker in MPI_IALLTOALLW
- ompi-release#1081: Support MPI_IN_PLACE in MPI_(I)ALLTOALLW and MPI_(I)EXSCAN
- ompi-release#1107: Allow future PMIx support for RM spawn limits
- ompi-release#1108: Fix sparse group process reference counting
- ompi-release#1109: If specified to be oversubscribed, disable binding
- ompi-release#1122: Allow NULL arrays for empty datatypes
- ompi-release#1123: Fix signed vs. unsigned compiler warnings
- ompi-release#1123: Make max hostname length uniform across code base
- ompi-release#1127: Fix MPI_Compare_and_swap
- ompi-release#1127: Fix MPI_Win_lock when used with MPI_Win_fence
- ompi-release#1132: Fix typo in help message for --enable-mca-no-build
- ompi-release#1154: Ensure pairwise coll algorithms disqualify themselves properly
- ompi-release#1165: Fix typos in debugging/verbose message output
- ompi-release#1178: Fix ROMIO filesystem check on OpenBSD 5.7
- ompi-release#1197: Fix Fortran pthread configure check
- ompi-release#1205: Allow using external PMIx 1.1.4 and 2.0
- ompi-release#1215: Fix configure to support the NAG Fortran compiler
- ompi-release#1220: Fix combiner args for MPI_HINDEXED_BLOCK
- ompi-release#1225: Fix combiner args for MPI_DARRAY
- ompi-release#1226: Disable old memory hooks with recent gcc versions
- ompi-release#1231: Fix new "patcher" support for some XLC platforms
- ompi-release#1244: Fix Java error handling
- ompi-release#1250: Ensure TCP is not selected for RDMA operations
- ompi-release#1252: Fix verbose output in coll selection
- ompi-release#1253: Set a default name for user-defined MPI_Op
- ompi-release#1254: Add count==0 checks in some non-blocking colls
- ompi-release#1258: Fix "make distclean" when using external pmix/hwloc/libevent
- ompi-release#1260: Clean up/uniform mca/coll/base memory management
- ompi-release#1261: Remove "patcher" warning message for static builds
- ompi-release#1263: Fix IO MPI_Request for 0-size read/write
- ompi-release#1264: Add blocking fence for SLURM operations

Bug fixes / minor enhancements:

- Updated internal/embedded copies of third-party software:
  - Update the internal copy of ROMIO to that which shipped in MPICH
    3.1.4.
  - Update internal copy of libevent to v2.0.22.
  - Update internal copy of hwloc to v1.11.2.
- Notable new MCA parameters:
  - opal_progress_lp_call_ratio: Control how often low-priority
    callbacks are made during Open MPI's main progress loop.
  - opal_common_verbs_want_fork_support: This replaces the
    btl_openib_want_fork_support parameter.
- Add --with-platform-patches-dir configure option.
- Add --with-pmi-libdir configure option for environments that install
  PMI libs in a non-default location.
- Various configure-related compatibility updates for newer versions
  of libibverbs and OFED.
- Numerous fixes/improvements to orte-dvm.  Special thanks to Mark
  Santcroos for his help.
- Fix a problem with timer code on ia32 platforms.  Thanks to
  Paul Hargrove for reporting this and providing a patch.
- Fix a problem with use of a 64 bit atomic counter.  Thanks to
  Paul Hargrove for reporting.
- Fix a problem with singleton job launching.  Thanks to Lisandro
  Dalcin for reporting.
- Fix a problem with use of MPI_UNDEFINED with MPI_COMM_SPLIT_TYPE.
  Thanks to Lisandro Dalcin for reporting.
- Silence a compiler warning in PSM MTL.  Thanks to Adrian Reber for
  reporting this.
- Properly detect Intel TrueScale and OmniPath devices in the ACTIVE
  state.  Thanks to Durga Choudhury for reporting the issue.
- Fix detection and use of Solaris Studio 12.5 (beta) compilers.
  Thanks to Paul Hargrove for reporting and debugging.
- Fix various small memory leaks.
- Allow NULL arrays when creating empty MPI datatypes.
- Replace use of alloca with malloc for certain datatype creation
  functions.  Thanks to Bogdan Sataric for reporting this.
- Fix use of MPI_LB and MPI_UB in creation of certain MPI datatypes.
  Thanks to Gus Correa for helping to fix this.
- Implement a workaround for a GNU Libtool problem.  Thanks to Eric
  Schnetter for reporting and fixing.
- Improve hcoll library detection in configure.  Thanks to David
  Shrader and Ake Sandgren for reporting this.
- Miscellaneous minor bug fixes in the hcoll component.
- Miscellaneous minor bug fixes in the ugni component.
- Fix problems with XRC detection in OFED 3.12 and older releases.
  Thanks to Paul Hargrove for his analysis of this problem.
- Update (non-standard/experimental) Java MPI interfaces to support
  MPI-3.1 functionality.
- Fix an issue with MCA parameters for Java bindings.  Thanks to
  Takahiro Kawashima and Siegmar Gross for reporting this issue.
- Fix a problem when using persistent requests in the Java bindings.
  Thanks to Nate Chambers for reporting.
- Fix problem with Java bindings on OS X 10.11.  Thanks to Alexander
  Daryin for reporting this issue.
- Fix a performance problem for large messages for Cray XC systems.
  Thanks to Jerome Vienne for reporting this.
- Fix an issue with MPI_WIN_LOCK_ALL.  Thanks to Thomas Jahns for
  reporting.
- Fix an issue with passing a parameter to configure multiple times.
  Thanks to QuesarVII for reporting and supplying a fix.
- Add support for ALPS resource allocation system on Cray CLE 5.2 and
  later.  Thanks to Mark Santcroos.
- Corrections to the HACKING file.  Thanks to Maximilien Levesque.
- Fix an issue with user supplied reduction operator functions.
  Thanks to Rupert Nash for reporting this.
- Fix an issue with an internal list management function.  Thanks to
  Adrian Reber for reporting this.
- Fix a problem with MPI-RMA PSCW epochs.  Thanks to Berk Hess for
  reporting this.
- Fix a problem in neighborhood collectives.  Thanks to Lisandro
  Dalcin for reporting.
- Fix MPI_IREDUCE_SCATTER_BLOCK for a one-process communicator. Thanks
  to Lisandro Dalcin for reporting.
- Add (Open MPI-specific) additional flavors to MPI_COMM_SPLIT_TYPE.
  See MPI_Comm_split_type(3) for details.  Thanks to Nick Andersen for
  supplying this enhancement.
- Improve closing of file descriptors during the job launch phase.
  Thanks to Piotr Lesnicki for reporting and providing this
  enhancement.
- Fix a problem in MPI_GET_ACCUMULATE and MPI_RGET_ACCUMULATE when
  using Portals4.  Thanks to Nicolas Chevalier for reporting.
- Use correct include file for lstat prototype in ROMIO. Thanks to
  William Throwe for finding and providing a fix.
- Add missing Fortran bindings for MPI_WIN_ALLOCATE.  Thanks to Christoph
  Niethammer for reporting and fixing.
- Fortran related fixes to handle Intel 2016 compiler.  Thanks to
  Fabrice Roy for reporting this.
- Fix a Fortran linkage issue.  Thanks to Marco Atzeri for finding and
  suggesting a fix.
- Fix problem with using BIND(C) for Fortran bindings with logical
  parameters.  Thanks to Paul Romano for reporting.
- Fix an issue with use of DL-related macros in opal library.  Thanks to
  Scott Atchley for finding this.
- Fix an issue with parsing mpirun command line options which contain
  colons.  Thanks to Lev Given for reporting.
- Fix a problem with Open MPI's package configury files.  Thanks to
  Christoph Junghans for reporting.
- Fix a typo in the MPI_INTERCOMM_MERGE man page.  Thanks to Harald
  Servat for reporting and correcting.
- Update man pages for non-blocking sends per MPI 3.1 standard.
  Thanks to Alexander Pozdneev for reporting.
- Fix problem when compiling against PVFS2.  Thanks to Dave Love for
  reporting.
- Fix problems with MPI_NEIGHBOR_ALLTOALL{V,W}.  Thanks to Willem
  Vermin for reporting this issue.
- Fix various compilation problems on Cygwin.  Thanks to Marco Atzeri
  for supplying these fixes.
- Fix problem with resizing of subarray and darray data types.  Thanks
  to Keith Bennett and Dan Garmann for reporting.
- Fix a problem with MPI_COMBINER_RESIZED.  Thanks to James Ramsey for
  the report.
- Fix an hwloc binding issue.  Thanks to Ben Menadue for reporting.
- Fix a problem with the shared memory (sm) BTL.  Thanks to Peter Wind
  for the report.
- Fixes for heterogeneous support. Thanks to Siegmar Gross for reporting.
- Fix a problem with memchecker.  Thanks to Clinton Stimpson for reporting.
- Fix a problem with MPI_UNWEIGHTED in topology functions.  Thanks to
  Jun Kudo for reporting.
- Fix problem with a MCA parameter base filesystem types.  Thanks to
  Siegmar Gross for reporting.
- Fix a problem with some windows info argument types.  Thanks to
  Alastair McKinstry for reporting.

1.10.7 -- 16 May 2017
---------------------

- Fix bug in TCP BTL that impacted performance on 10GbE (and faster)
  networks: the TCP send/recv buffer sizes are no longer adjusted, and
  system default values are used instead
- Add missing MPI_AINT_ADD and MPI_AINT_DIFF function declarations in
  mpif.h
- Fixed time reported by MPI_WTIME; it was previously reported as
  dependent upon the CPU frequency.
- Fix platform detection on FreeBSD
- Fix a bug in the handling of MPI_TYPE_CREATE_DARRAY in
  MPI_(R)(GET_)ACCUMULATE
- Fix openib memory registration limit calculation
- Add missing MPI_T_PVAR_SESSION_NULL in mpi.h
- Fix "make distcheck" when using external hwloc and/or libevent packages
- Add latest ConnectX-5 vendor part id to OpenIB device params
- Fix race condition in the UCX PML
- Fix signal handling for rsh launcher
- Fix Fortran compilation errors by removing MPI_SIZEOF in the Fortran
  interfaces when the compiler does not support it
- Fixes for the pre-ignore-TKR "mpi" Fortran module implementation
  (i.e., for older Fortran compilers -- these problems did not exist
  in the "mpi" module implementation for modern Fortran compilers):
  - Add PMPI_* interfaces
  - Fix typo in MPI_FILE_WRITE_AT_ALL_BEGIN interface name
  - Fix typo in MPI_FILE_READ_ORDERED_BEGIN interface name
- Fixed the type of MPI_DISPLACEMENT_CURRENT in all Fortran interfaces
  to be an INTEGER(KIND=MPI_OFFSET_KIND).
- Fixed typos in MPI_INFO_GET_* man pages.  Thanks to Nicolas Joly for
  the patch
- Fix typo bugs in wrapper compiler script
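
The TCP BTL buffer-sizing fix at the top of this list comes down to
not overriding the operating system's choice.  As a rough
illustration in Python (a sketch, not Open MPI code; the function
name default_tcp_buffer_sizes is hypothetical), a freshly created TCP
socket already carries OS-chosen send and receive buffer sizes, and
they stay OS-managed as long as the application never sets SO_SNDBUF
or SO_RCVBUF itself:

```python
import socket


def default_tcp_buffer_sizes():
    """Return (send, recv) buffer sizes the OS assigns to a new TCP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # No setsockopt(SO_SNDBUF/SO_RCVBUF) call is made here, so the
        # values read back are whatever the operating system chose.
        snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return snd, rcv
```

On Linux, explicitly setting either option turns off the kernel's
automatic buffer tuning for that socket, which is one plausible
reason fixed sizes can hurt throughput on fast links.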

1.10.6 -- 17 Feb 2017
---------------------

- Fix bug in timer code that caused problems at optimization settings
  greater than 2
- OSHMEM: make mmap allocator the default instead of sysv or verbs
- Support MPI_Dims_create with dimension zero
- Update USNIC support
- Prevent 64-bit overflow on timer counter
- Add support for forwarding signals
- Fix bug that caused truncated messages on large sends over TCP BTL
- Fix potential infinite loop when printing a stacktrace

1.10.5 -- 19 Dec 2016
---------------------

- Update UCX APIs
- Fix bug in darray that caused MPI/IO failures
- Use an MPI_Get_library_version()-like string to tag the debugger DLL.
  Thanks to Alastair McKinstry for the report
- Fix multi-threaded race condition in coll/libnbc
- Several fixes to OSHMEM
- Fix bug in UCX support due to uninitialized field
- Fix MPI_Ialltoallv with MPI_IN_PLACE and without MPI param check
- Correctly reset receive request type before init. Thanks Chris Pattison
  for the report and test case.
- Fix bug in iallgather[v]
- Fix concurrency issue with MPI_Comm_accept. Thanks to Pieter Noordhuis
  for the patch
- Fix ompi_coll_base_{gather,scatter}_intra_binomial
- Fixed an issue with MPI_Type_get_extent returning the wrong extent
  for distributed array datatypes.
- Re-enable use of the rdtsc instruction as a monotonic clock source if
  the processor has a core-invariant tsc. This is a partial fix for a
  performance regression introduced in Open MPI v1.10.3.


1.10.4 -- 01 Sept 2016
----------------------

- Fix assembler support for MIPS
- Improve memory handling for temp buffers in collectives
- Fix [all]reduce with non-zero lower bound datatypes.
  Thanks to Hristo Iliev for the report
- Fix non-standard ddt handling. Thanks Yuki Matsumoto for the report
- Various libnbc fixes. Thanks Yuki Matsumoto for the report
- Fix typos in request RMA bindings for Fortran. Thanks to @alazzaro
  and @vondele for the assist
- Various bug fixes and enhancements to collective support
- Fix predefined types mapping in hcoll
- Revive the coll/sync component to resolve unexpected message issues
  during tight loops across collectives
- Fix typo in wrapper compiler for Fortran static builds


1.10.3 -- 15 June 2016
----------------------

- Fix zero-length datatypes.  Thanks to Wei-keng Liao for reporting
  the issue.
- Minor manpage cleanups
- Implement atomic support in OSHMEM/UCX
- Fix support of MPI_COMBINER_RESIZED. Thanks to James Ramsey
  for the report
- Fix computation of #cpus when --use-hwthread-cpus is used
- Add entry points for Allgatherv, iAllgatherv, Reduce, and iReduce
  for the HCOLL library
- Fix an HCOLL integration bug that could signal completion of a
  request while it was still being processed
- Fix computation of cores when SMT is enabled. Thanks to Ben Menadue
  for the report
- Various USNIC fixes
- Create a datafile in the per-proc directory in order to make it
  unique per communicator. Thanks to Peter Wind for the report
- Fix zero-size malloc in one-sided pt-to-pt code. Thanks to Lisandro
  Dalcin for the report
- Fix MPI_Get_address when passed MPI_BOTTOM to not return an error.
  Thanks to Lisandro Dalcin for the report
- Fix MPI_TYPE_SET_ATTR with NULL value. Thanks to Lisandro Dalcin for
  the report
- Fix various Fortran08 binding issues
- Fix memchecker no-data case. Thanks to Clinton Stimpson for the report
- Fix CUDA support under OS-X
- Fix various OFI/MTL integration issues
- Add MPI_T man pages
- Fix one-sided pt-to-pt issue by preventing communication from happening
  before a target enters a fence, even in the no-precede case
- Fix a bug that disabled TotalView for the MPMD use case
- Correctly support MPI_UNWEIGHTED in topo-graph-neighbors. Thanks to
  Jun Kudo for the report
- Fix singleton operations under SLURM when PMI2 is enabled
- Do not use MPI_IN_PLACE in neighborhood collectives for non-blocking
  collectives (libnbc). Thanks to Jun Kudo for the report
- Silence autogen deprecation warnings for newer versions of Perl
- Do not return MPI_ERR_PENDING from collectives
- Use type int* for MPI_WIN_DISP_UNIT, MPI_WIN_CREATE_FLAVOR, and MPI_WIN_MODEL.
  Thanks to Alastair McKinstry for the report
- Fix register_datarep stub function in IO/OMPIO. Thanks to Eric
  Chamberland for the report
- Fix a bus error on MPI_WIN_[POST,START] in the shared memory one-sided component
- Add several missing MPI_WIN_FLAVOR constants to the Fortran support
- Enable connecting processes from different subnets using the openib BTL
- Fix bug in basic/barrier algorithm in OSHMEM
- Correct process binding for the --map-by node case
- Include support for subnet-to-subnet routing over InfiniBand networks
- Fix usnic resource check
- AUTHORS: Fix an errant reference to Subversion IDs
- Fix affinity for MPMD jobs running under LSF
- Fix many Fortran binding bugs
- Fix MPI_IN_PLACE-related bugs
- Fix PSM/PSM2 support for singleton operations
- Ensure MPI transports continue to progress during RTE barriers
- Update HWLOC to 1.9.1 end-of-series
- Fix a bug in the Java command line parser when the
  -Djava.library.path option was given by the user
- Update the MTL/OFI provider selection behavior
- Add support for clock_gettime on Linux.
- Correctly detect and configure for Solaris Studio 12.5
  beta compilers
- Correctly compute #slots when -host is used for MPMD case
- Fix a bug in the hcoll collectives due to an uninitialized field
- Do not set a binding policy when oversubscribing a node
- Fix hang in intercommunicator operations when oversubscribed
- Speed up process termination during MPI_Abort
- Disable backtrace support by default in the PSM/PSM2 libraries to
  prevent unintentional conflicting behavior.
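
As a rough illustration of the clock_gettime entry above, the following
Python sketch times a callable using the POSIX monotonic clock, which is
not affected by wall-clock adjustments.  It uses only the standard
library's time.clock_gettime (available on Unix-like systems); the
helper name elapsed_monotonic is hypothetical, and nothing here reflects
how Open MPI itself reads the clock.

```python
import time

def elapsed_monotonic(work):
    """Return the seconds spent in work(), measured on CLOCK_MONOTONIC."""
    start = time.clock_gettime(time.CLOCK_MONOTONIC)
    work()
    end = time.clock_gettime(time.CLOCK_MONOTONIC)
    return end - start

# Time a small, arbitrary piece of work.
print(elapsed_monotonic(lambda: sum(range(100000))))
```

Because the clock is monotonic, the returned interval is never negative,
even if the system time is changed while the work is running.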


1.10.2 -- 26 Jan 2016
---------------------

 **********************************************************************
 *  OSHMEM is now 1.2 compliant
 **********************************************************************

- Fix NBC_Copy for legitimate zero-size messages
- Fix multiple bugs in OSHMEM
- Correctly handle mpirun --host <user>@<ip-address>
- Centralize two MCA params to avoid duplication between OMPI and
  OSHMEM layers: opal_abort_delay and opal_abort_print_stack
- Add support for Fujitsu compilers
- Add UCX support for OMPI and OSHMEM
- Correctly handle oversubscription when not given directives
  to permit it. Thanks to @ammore1 for reporting it
- Fix rpm spec file to not include the /usr directory
- Add Intel HFI1 default parameters for the openib BTL
- Resolve symbol conflicts in the PSM2 library
- Add ability to empty the rgpusm cache when full if requested
- Fix another libtool bug when -L requires a space between it
  and the path. Thanks to Eric Schnetter for the patch.
- Add support for OSHMEM v1.2 APIs
- Improve efficiency of oshmem_preconnect_all algorithm
- Fix bug in buffered sends support
- Fix double free in edge case of mpirun. Thanks to @jsharpe for
  the patch
- Multiple one-sided support fixes
- Fix integer overflow in the tuned "reduce" collective when
  using buffers larger than INT_MAX in size
- Fix parse of user environment variables in mpirun. Thanks to
  Stefano Garzarella for the patch
- Performance improvements in PSM2 support
- Fix NBC iBarrier for inter-communicators
- Fix bug in vader BTL during finalize
- Improved configure support for Fortran compilers
- Fix rank_file mapper to support default --slot-set. Thanks
  to Matt Thompson for reporting it
- Update MPI_Testsome man page. Thanks to Eric Schnetter for
  the suggestion
- Fix missing resize of the returned type for subarray and
  darray types. Thanks to Keith Bennett and Dan Garmann for
  reporting it
- Fix Java support on OSX 10.11. Thanks to Alexander Daryin
  for reporting the problem
- Fix some compilation issues on Solaris 11.2. Thanks to
  Paul Hargrove for his continued help in such areas
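
To illustrate the class of bug behind the INT_MAX entry above, the
following Python sketch shows what happens when a byte count (element
count times element size) is squeezed into a 32-bit signed integer.  It
is not Open MPI code: the function names are hypothetical, and a 32-bit
C int is assumed, as is typical on common 64-bit Linux platforms.

```python
import ctypes

INT_MAX = 2**31 - 1  # assumption: C int is 32 bits wide

def bytes_as_c_int(count, extent):
    """Byte count as it would come out if stored in a 32-bit signed int."""
    return ctypes.c_int32(count * extent).value

def bytes_as_wide(count, extent):
    """Byte count kept in an integer wide enough to hold it."""
    return count * extent

count, extent = 600_000_000, 4             # 2.4e9 bytes, above INT_MAX
print(bytes_as_c_int(count, extent))       # wraps around to a negative value
print(bytes_as_wide(count, extent))        # 2400000000
```

The usual remedy in C is to carry such sizes in a wider type (for
example size_t or ptrdiff_t) before multiplying.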


1.10.1 -- 4 Nov 2015
--------------------

- Work around an optimization problem with gcc compilers >= 4.9.2 that
  causes problems with memory registration, and forced
  mpi_leave_pinned to default to 0 (i.e., off).  Thanks to @oere for
  the fix.
- Fix use of MPI_LB and MPI_UB in subarray and darray datatypes.
  Thanks to Gus Correa and Dimitar Pashov for pointing out the issue.
- Minor updates to mpi_show_mpi_alloc_mem_leaks and
  ompi_debug_show_handle_leaks functionality.
- Fix segv when invoking non-blocking reductions with a user-defined
  operation.  Thanks to Rupert Nash and Georg Geiser for identifying
  the issue.
- No longer probe for PCI topology on Solaris (unless running as root).
- Fix for Intel Parallel Studio 2016 ifort partial support of the
  !GCC$ pragma.  Thanks to Fabrice Roy for reporting the problem.
- Bunches of Coverity / static analysis fixes.
- Fixed ROMIO to look for lstat in <sys/stat.h>.  Thanks to William
  Throwe for submitting the patch both upstream and to Open MPI.
- Fixed minor memory leak when attempting to open plugins.
- Fixed type in MPI_IBARRIER C prototype.  Thanks to Harald Servat for
  reporting the issue.
- Add missing man pages for MPI_WIN_CREATE_DYNAMIC, MPI_WIN_ATTACH,
  MPI_WIN_DETACH, MPI_WIN_ALLOCATE, MPI_WIN_ALLOCATE_SHARED.
- When mpirun-launching new applications, only close file descriptors
  that are actually open (resulting in a faster launch in some
  environments).
- Fix "test ==" issues in Open MPI's configure script.  Thanks to Kevin
  Buckley for pointing out the issue.
- Fix performance issue in usnic BTL: ensure progress thread is
  throttled back to not aggressively steal CPU cycles.
- Fix cache line size detection on POWER architectures.
- Add missing #include in a few places.  Thanks to Orion Poplawski for
  supplying the patch.
- When OpenSHMEM building is disabled, no longer install its header
  files, help files, or man pages.  Add man pages for oshrun, oshcc,
  and oshfort.
- Fix mpi_f08 implementations of MPI_COMM_SET_INFO, and profiling
  versions of MPI_BUFFER_DETACH, MPI_WIN_ALLOCATE,
  MPI_WIN_ALLOCATE_SHARED, MPI_WTICK, and MPI_WTIME.
- Add orte_rmaps_dist_device MCA param, allowing users to map near a
  specific device.
- Various updates/fixes to the openib BTL.
- Add missing defaults for the Mellanox ConnectX 3 card to the openib BTL.
- Minor bug fixes in the OFI MTL.
- Various updates to Mellanox's MXM, hcoll, and FCA components.
- Add OpenSHMEM man pages.  Thanks to Tony Curtis for sharing the man
  pages files from openshmem.org.
- Add missing "const" attributes to MPI_COMPARE_AND_SWAP,
  MPI_FETCH_AND_OP, MPI_RACCUMULATE, and MPI_WIN_DETACH prototypes.
  Thanks to Michael Knobloch and Takahiro Kawashima for bringing this
  to our attention.
- Fix linking issues on some platforms (e.g., SLES 12).
- Fix hang on some corner cases when MPI applications abort.
- Add missing options to mpirun man page. Thanks to Daniel Letai
  for bringing this to our attention.
- Add new --with-platform-patches-dir configure option
- Adjust relative selection priorities to ensure that MTL
  support is favored over BTL support when both are available
- Use CUDA IPC for all sized messages for performance
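
The entry above about closing only the file descriptors that are
actually open can be pictured with the following Python sketch.  It
contrasts a blind loop over every possible descriptor number (bounded by
the RLIMIT_NOFILE soft limit) with a listing of the descriptors that
really exist.  This is an illustration only, not mpirun's
implementation; the helper open_fds is hypothetical and relies on a
per-process descriptor directory such as /proc/self/fd on Linux.

```python
import os
import resource

def open_fds(keep=(0, 1, 2)):
    """Return the descriptors that are actually open, skipping those in keep.

    Assumes a per-process fd directory: /proc/self/fd on Linux, with
    /dev/fd as a fallback on some other Unix-like systems.
    """
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    fds = []
    for name in os.listdir(fd_dir):
        fd = int(name)
        if fd in keep:
            continue
        try:
            os.fstat(fd)      # still open once the listing has finished?
        except OSError:
            continue          # e.g. the descriptor os.listdir itself used
        fds.append(fd)
    return sorted(fds)

soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
print("descriptor numbers a blind loop would visit:", soft_limit)
print("descriptors actually open (beyond stdio):", open_fds())
```

When the soft limit is large, the list of open descriptors is typically
far shorter than the range a blind loop would have to walk, which is
where the launch-time saving comes from.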


1.10.0 -- 25 Aug 2015
---------------------

** NOTE: The v1.10.0 release marks the transition to Open MPI's new
** version numbering scheme.  The v1.10.x release series is based on
** the v1.8.x series, but with a few new features.  v2.x will be the
** next series after the v1.10.x series, and complete the transition
** to the new version numbering scheme.  See README for more details
** on the new versioning scheme.
**
** NOTE: In accordance with OMPI version numbering, the v1.10 series is *not*
** API compatible with the v1.8 release series.
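
One practical consequence of moving from v1.8.x to v1.10.x is that
scripts comparing version strings textually will order the two series
incorrectly.  The Python sketch below shows a numeric comparison
instead.  The helper version_key is hypothetical and handles only plain
"major.minor.release" strings (no suffixes such as release-candidate
tags); see the README for the authoritative description of the scheme.

```python
def version_key(version):
    """Turn 'major.minor.release' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.lstrip("v").split("."))

releases = ["1.8.8", "1.10.0", "1.10.4", "1.10.1"]

# A plain string sort places every 1.10.x release before 1.8.8.
print(sorted(releases))

# A numeric sort places 1.8.8 first, followed by the 1.10.x releases.
print(sorted(releases, key=version_key))
```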

- Added libfabric support (see README for more details):
  - usNIC BTL updated to use libfabric.
  - Added OFI MTL (usable with PSM in libfabric v1.1.0).
- Added Intel Omni-Path support via new PSM2 MTL.
- Added "yalla" PML for faster MXM support.
- Removed support for MX
- Added persistent distributed virtual machine (pDVM) support for fast
  workflow executions.
- Fixed typo in GCC inline assembly introduced in Open MPI v1.8.8.
  Thanks to Paul Hargrove for pointing out the issue.
- Add missing man pages for MPI_Win_get|set_info(3).
- Ensure that session directories are cleaned up at the end of a run.
- Fixed linking issues on some OSs where symbols of dependent
  libraries are not automatically publicly available.
- Improve hcoll and fca configury library detection.  Thanks to David
  Shrader for helping track down the issue.
- Removed the LAMA mapper (for use in setting affinity).  Its
  functionality has been largely superseded by other mpirun CLI
  options.
- CUDA: Made the asynchronous copy mode be the default.
- Fix a malloc(0) warning in MPI_IREDUCE_SCATTER_BLOCK.  Thanks to
  Lisandro Dalcin for reporting the issue.
- Fix typo in MPI_Scatter(3) man page.  Thanks to Akshay Venkatesh for
  noticing the mistake.
- Add rudimentary protection from TCP port scanners.
- Fix typo in Open MPI error handling.  Thanks to Ake Sandgren for
  pointing out the error.
- Increased the performance of the CM PML (i.e., the Portals, PSM,
  PSM2, MXM, and OFI transports).
- Restored visibility of blocking send requests in message queue
  debuggers (e.g., TotalView, DDT).
- Fixed obscure IPv6-related bug in the TCP BTL.
- Add support for the "no_locks" MPI_Info key for one-sided
  functionality.
- Fixed ibv_fork support for verbs-based networks.
- Fixed a variety of small bugs in OpenSHMEM.
- Fixed MXM configure with additional CPPFLAGS and LDFLAGS.  Thanks to
  David Shrader for the patch.
- Fixed incorrect memalign threshold in the openib BTL.  Thanks to
  Xavier Besseron for pointing out the issue.