Managing software with Lmod

Why software needs managing

Almost all software requires that you modify your environment in some way. Your environment consists of the running shell, typically bash on Flux, and the environment variables that are set. The environment variable most familiar to most people is PATH, which lists the directories in which the shell searches for a command, but a particular software package may depend on many others.
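
As a quick illustration of what PATH holds, the following sketch prints its directories in the order the shell searches them and shows where one command is found. It uses only standard shell tools and does not depend on Lmod; the variable names path_dirs and ls_location are made up for this example.

```shell
# Split PATH on colons and print one directory per line, in search order.
path_dirs=$(printf '%s\n' "$PATH" | tr ':' '\n')
printf '%s\n' "$path_dirs"

# Show what the shell would run for a command (ls is just an example).
ls_location=$(command -v ls)
echo "ls resolves to: $ls_location"
```

In an interactive session where ls is aliased, command -v reports the alias definition rather than a file path.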

Since July 2016, Flux has used a program called Lmod to manage the environment changes needed when many versions of the same software are installed. Lmod helps manage conflicts among environment variables across the full range of software packages. You can also use Lmod to modify your own default environment settings, and it is useful if you install software for your own use.

Basic Lmod usage

Listing, loading, and unloading modules

Lmod provides the module command, an easy mechanism for changing the environment as needed to add or remove software packages from your environment.

Load the modules your job needs before you submit the job to the cluster, not from within a PBS submit script.

A module is a collection of environment variable settings that can be loaded or unloaded. When you first log into Flux, the system will look to see if you have defined a default module set, and if you have, it will restore that set of modules. See below for information about module sets and how to create them. To see which modules are currently loaded, you can use the command

$ module list

Currently Loaded Modules:
  1) intel/16.0.3   2) openmpi/1.10.2/intel/16.0.3   3) StdEnv

We try to make the names of the modules as close to the official name of the software as we can, so you can see what is available by using, for example,

$ module av matlab

------------------------ /sw/arcts/centos7/modulefiles -------------------------
   matlab/R2016a

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".

where av stands for avail (available). To make the software found available for use, you use

$ module load matlab

(You can also use add instead of load, if you prefer.) If you need to use software that is incompatible with MATLAB, you would remove the MATLAB module using

$ module unload matlab
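
One way to confirm that loading and unloading really change your environment is to check whether the program can be found before and after. The sketch below assumes that the matlab module places an executable named matlab on your PATH, which is an assumption rather than something stated above; report_cmd is a hypothetical helper defined here for illustration. The module commands run only if Lmod is available in the current shell.

```shell
# Hypothetical helper: report whether a command can currently be found.
report_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found"
    fi
}

# Only attempt the module commands when Lmod is set up in this shell.
if command -v module >/dev/null 2>&1; then
    module load matlab
    report_cmd matlab      # assumed to be found after loading
    module unload matlab
    report_cmd matlab      # assumed to be gone after unloading
fi
```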

More ways to find modules

In the output from module av matlab, module suggests a couple of alternate ways to search for software. When you use module av, it will match the search string anywhere in the module name; for example,

$ module av gcc

------------------------ /sw/arcts/centos7/modulefiles -------------------------
   fftw/3.3.4/gcc/4.8.5                          hdf5-par/1.8.16/gcc/4.8.5
   fftw/3.3.4/gcc/4.9.3                   (D)    hdf5-par/1.8.16/gcc/4.9.3 (D)
   gcc/4.8.5                                     hdf5/1.8.16/gcc/4.8.5
   gcc/4.9.3                                     hdf5/1.8.16/gcc/4.9.3     (D)
   gcc/5.4.0                              (D)    openmpi/1.10.2/gcc/4.8.5
   gromacs/5.1.2/openmpi/1.10.2/gcc/4.9.3        openmpi/1.10.2/gcc/4.9.3
   gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0 (D)    openmpi/1.10.2/gcc/5.4.0  (D)

  Where:
   D:  Default Module

However, if you are looking for just gcc, that is more than you really want. So, you can use one of two commands. The first is

$ module spider gcc

----------------------------------------------------------------------------
  gcc:
----------------------------------------------------------------------------
    Description:
      GNU compiler suite

     Versions:
        gcc/4.8.5
        gcc/4.9.3
        gcc/5.4.0

     Other possible modules matches:
        fftw/3.3.4/gcc  gromacs/5.1.2/openmpi/1.10.2/gcc  hdf5-par/1.8.16/gcc  ...

----------------------------------------------------------------------------
  To find other possible module matches do:
      module -r spider '.*gcc.*'

----------------------------------------------------------------------------
  For detailed information about a specific "gcc" module (including how to load
the modules) use the module's full name.
  For example:

     $ module spider gcc/5.4.0
----------------------------------------------------------------------------

That is probably more like what you are looking for if you really are searching just for gcc. That also gives suggestions for alternate searching, but let us return to the first set of suggestions, and see what we get with keyword searching.

At the time of writing, if you were to use module av to look for Python, you would get this result.

$ module av python

------------------------ /sw/arcts/centos7/modulefiles -------------------------
   python-dev/3.5.1

However, we have Python distributions that are installed that do not have python as part of the module name. In this case, module spider will also not help. Instead, you can use

$ module keyword python

----------------------------------------------------------------------------
The following modules match your search criteria: "python"
----------------------------------------------------------------------------

  anaconda2: anaconda2/4.0.0
    Python 2 distribution.

  anaconda3: anaconda3/4.0.0
    Python 3 distribution.

  epd: epd/7.6-1
    Enthought Python Distribution

  python-dev: python-dev/3.5.1
    Python is a general purpose programming language

----------------------------------------------------------------------------
To learn more about a package enter:

   $ module spider Foo

where "Foo" is the name of a module

To find detailed information about a particular package you
must enter the version if there is more than one version:

   $ module spider Foo/11.1
----------------------------------------------------------------------------

That displays all the modules that have been tagged with the python keyword or where python appears in the module name.
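
If you want to narrow module av output yourself, you can pipe it through grep. Lmod normally writes this output to standard error rather than standard output, so it has to be redirected first, and the -t (terse) option prints one name per line, which is easier to filter. The following is a minimal sketch; filter_base is a hypothetical helper, not part of Lmod.

```shell
# Hypothetical helper: keep only module names whose first element is
# exactly the package given as $1. Reads names on stdin, one per line.
filter_base() {
    grep "^$1/" || true
}

# Merge stderr into stdout so the pipe can see the listing.
# The module command runs only if Lmod is set up in this shell.
if command -v module >/dev/null 2>&1; then
    module -t av 2>&1 | filter_base gcc
fi
```

With the gcc listing shown earlier, this would keep the three gcc compiler modules and drop names such as fftw/3.3.4/gcc/4.8.5, where gcc appears later in the name.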

More about software versions

Lmod marks the default version with (D) in the output from module av. The default version is the one that will be loaded if you do not specify a version.

$ module av gromacs

------------------------ /sw/arcts/centos7/modulefiles -------------------------
   gromacs/5.1.2/openmpi/1.10.2/gcc/4.9.3
   gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0 (D)

  Where:
   D:  Default Module

When loading modules with complex names, for example, gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0, you can specify up to the second-from-last element to load the default version. That is,

$ module load gromacs/5.1.2/openmpi/1.10.2/gcc

will load gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0

To load a version other than the default, specify the version as it is displayed by the module av command; for example,

$ module load gromacs/5.1.2/openmpi/1.10.2/gcc/4.9.3

When unloading a module, only the base name need be given; for example, if you loaded either gromacs module, you can unload it with

$ module unload gromacs
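
The three forms of a module name used above can be derived from the full name with ordinary shell parameter expansion. This sketch only manipulates strings and prints the corresponding commands; it does not call Lmod, and the variable names are made up for illustration.

```shell
full_name="gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0"

# Everything before the first slash: enough for "module unload".
base_name="${full_name%%/*}"

# Everything before the last slash: loads the default (D) version at that level.
default_prefix="${full_name%/*}"

echo "unload with:     module unload $base_name"
echo "default version: module load $default_prefix"
echo "pinned version:  module load $full_name"
```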

Module prerequisites and named sets

Some modules rely on other modules. For example, the gromacs module has many dependencies, some of which conflict with the default modules. To load it, you might first clear all modules with module purge, then load the dependencies, then finally load gromacs.

$ module list
Currently Loaded Modules:
  1) intel/16.0.3   2) openmpi/1.10.2/intel/16.0.3   3) StdEnv

$ module purge
$ module load gcc/5.4.0 openmpi/1.10.2/gcc/5.4.0 boost/1.61.0 mkl/11.3.3
$ module load gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0
$ module list
Currently Loaded Modules:
  1) gcc/5.4.0                  4) mkl/11.3.3
  2) openmpi/1.10.2/gcc/5.4.0   5) gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0
  3) boost/1.61.0

That’s a lot to do each time. Lmod provides a way to store a set of modules and give it a name. So, once you have the above list of modules loaded, you can use

$ module save my_gromacs

to save the whole list under the name my_gromacs. We recommend that you make each set fully self-contained and that you use the full name and version for each module, which prevents problems if the default version of one of them changes. To switch to a saved set, use the combination

$ module purge
$ module restore my_gromacs
Restoring modules to user's my_gromacs

To see a list of the named sets you have (which are stored in ${HOME}/.lmod.d), use

$ module savelist
Named collection list:
  1) my_gromacs

and to see which modules are in a set, use

$ module describe my_gromacs
Collection "my_gromacs" contains: 
   1) gcc/5.4.0                   4) mkl/11.3.3
   2) openmpi/1.10.2/gcc/5.4.0    5) gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0
   3) boost/1.61.0
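
If you switch between named sets often, the purge and restore pair can be wrapped in a small shell function, for example in your ~/.bashrc. This is only a sketch: use_module_set is a hypothetical name, and the function assumes the set you ask for has already been saved with module save.

```shell
# Hypothetical helper: switch to a named module set from a clean state.
use_module_set() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; is Lmod set up in this shell?" >&2
        return 1
    fi
    module purge            # start from an empty environment
    module restore "$1"     # then load the saved set
}
```

After defining it, you would run use_module_set my_gromacs to get the same result as the two commands above.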

How to get more information about the module and the software

We try to provide some helpful information about the modules. For example,

$ module help openmpi/1.10.2/gcc/5.4.0
------------- Module Specific Help for "openmpi/1.10.2/gcc/5.4.0" --------------

OpenMPI consists of a set of compiler 'wrappers' that include the appropriate
settings for compiling MPI programs on the cluster.  The most commonly used
of these are

    mpicc
    mpic++
    mpif90

Those are used in the same way as the regular compiler program, for example,

    $ mpicc -o hello hello.c

will produce an executable program file, hello, from C source code in hello.c.

In addition to adding the OpenMPI executables to your path, the following
environment variables set by the openmpi module.

    $MPI_HOME

For some generic information about the program you can use

$ module whatis openmpi/1.10.2/gcc/5.4.0
openmpi/1.10.2/gcc/5.4.0      : Name: openmpi
openmpi/1.10.2/gcc/5.4.0      : Description: OpenMPI implementation of the MPI protocol
openmpi/1.10.2/gcc/5.4.0      : License information: https://www.open-mpi.org/community/license.php
openmpi/1.10.2/gcc/5.4.0      : Category: Utility, Development, Core
openmpi/1.10.2/gcc/5.4.0      : Package documentation: https://www.open-mpi.org/doc/
openmpi/1.10.2/gcc/5.4.0      : ARC examples: /scratch/data/examples/openmpi/
openmpi/1.10.2/gcc/5.4.0      : Version: 1.10.2

and for information about what the module will set in the environment (in addition to the help text), you can use

$ module show openmpi/1.10.2/gcc/5.4.0
[ . . . .  Help text edited for space -- see above . . . . ]
whatis("Name: openmpi")
whatis("Description: OpenMPI implementation of the MPI protocol")
whatis("License information: https://www.open-mpi.org/community/license.php")
whatis("Category: Utility, Development, Core")
whatis("Package documentation: https://www.open-mpi.org/doc/")
whatis("ARC examples: /scratch/data/examples/openmpi/")
whatis("Version: 1.10.2")
prereq("gcc/5.4.0")
prepend_path("PATH","/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/bin")
prepend_path("MANPATH","/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/share/man")
prepend_path("LD_LIBRARY_PATH","/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/lib")
setenv("MPI_HOME","/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0")

where the lines to attend to are the prepend_path(), setenv(), and prereq() calls. There is also an append_path() function that you may see. The prereq() function lists the other modules that must be loaded before the one being displayed. The other functions set or modify the environment variable named in their first argument; for example,

prepend_path("PATH", "/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/bin")

adds /sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/bin to the beginning of the PATH environment variable.
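
In plain shell terms, these functions correspond roughly to the assignments sketched below. The sketch works on a copy of the search path, demo_path, so that it does not change your real PATH; the starting value and the /opt/example/bin directory are made up for illustration. Unlike these assignments, Lmod also keeps track of what it changed so that module unload can undo it.

```shell
# Work on a copy so this sketch does not alter the real PATH.
demo_path="/usr/bin:/bin"
openmpi_root="/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0"

# prepend_path("PATH", dir): the new directory goes first, so it is searched first.
demo_path="${openmpi_root}/bin:${demo_path}"

# append_path("PATH", dir): the new directory goes last (made-up directory).
demo_path="${demo_path}:/opt/example/bin"

# setenv("MPI_HOME", value): set the variable and export it to child processes.
export MPI_HOME="${openmpi_root}"

echo "$demo_path"
echo "$MPI_HOME"
```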