4. Module System
4.1. Introduction
Different users have different software needs, and different projects often require conflicting versions of the same software. HPC clusters therefore typically provide compilers, libraries, and general software packages in several versions. The module system was created to keep these versions apart and to help users switch between them.
Using a module system is the classic way of making software available on HPC clusters. Lmod is a Lua-based module system that helps manage the user environment (PATH, LD_LIBRARY_PATH) through module files. It extends environment-modules, supporting TCL modules as well as a hierarchical MODULEPATH. Lmod is the module system used on tux.
A module provides a specific version of a software package that is installed on the system. Users activate a package by loading its module into their environment. The module system then adjusts PATH and other environment variables so that the corresponding software becomes available, which makes it easy to switch between different software versions. Hence, the module system provides a convenient way to change the user's environment dynamically through modulefiles and to prevent name clashes.
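The effect described above can be imitated by hand. The following is a minimal conceptual sketch, not Lmod's actual implementation. It assumes a hypothetical install prefix /opt/demo/gcc-11.2.0 (not a real path on tux) and shows how prepending a directory to PATH makes a package visible, and how restoring PATH hides it again.

```shell
# Conceptual sketch only; Lmod does considerably more than this.
# demo_prefix is a hypothetical install location used for illustration.
demo_prefix="/opt/demo/gcc-11.2.0"

# "Load": remember the old search path, then put the package first.
saved_path="$PATH"
PATH="$demo_prefix/bin:$PATH"
first_after_load="${PATH%%:*}"   # first entry of the search path
echo "first PATH entry after load: $first_after_load"

# "Unload": restore the search path exactly as it was before.
PATH="$saved_path"
```

A real module may also set variables such as LD_LIBRARY_PATH, and the module system keeps track of what has been loaded so that the changes can be undone reliably.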
Note
The central command of the module system is module, or its shorthand equivalent ml.
4.2. Using the Module System
4.2.1. Available Modules
To see all available modules, run module avail or the shorter, equivalent command ml av:
$ ml av
--------------------------------- /usr/local/etc/lmod/Core -------------------------------
Foss/2022.1 gcc/11.2.0 meson/0.60.0
Intel/2021.4 gh/2.0.0 ninja/1.10.2
anaconda3/2021.05 git/2.31.1 parallel/20210922
anaconda3/2022.05 (D) go/1.17.2 rclone/1.56.2
arm-forge/21.0 intel-compilers/2021.4.0 totalview/2020.0.25
cmake/3.21.4 julia/1.6.3 udunits/2.2.28
comsol/5.4 julia/1.6.6 (D)
comsol/5.6 (D) mercurial/5.8
Where:
D: Default Module
You can also restrict the listing to modules that match a name. For instance, to look for modules whose names start with anacon, run
$ ml av anacon
--------------------------------- /usr/local/etc/lmod/Core --------------------------------
anaconda3/2021.05 anaconda3/2022.05 (D)
Where:
D: Default Module
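In a script it can be useful to test for a module before trying to load it. The sketch below assumes default Lmod settings, under which the avail listing is written to stderr rather than stdout. The helper name module_is_available is hypothetical and not part of Lmod.

```shell
# Hypothetical helper (not part of Lmod): succeeds if at least one
# available module name starts with the given prefix.
module_is_available() {
    # -t (terse) prints one name per line. By default Lmod writes the
    # listing to stderr, so it is merged into stdout before filtering.
    # The prefix is used as a regular expression, which is fine for
    # plain names such as "anacon".
    module -t avail "$1" 2>&1 | grep -q "^$1"
}

# Example use, skipped on machines without a module command.
if command -v module >/dev/null 2>&1; then
    if module_is_available anacon; then
        echo "a module starting with anacon is available"
    else
        echo "no module starting with anacon was found"
    fi
fi
```

The same redirection (2>&1) is needed when paging or searching the output of ml av by hand.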
4.2.2. Loading Modules
Just after logging onto the system, no modules are loaded. To load the GNU Compiler Collection (gcc), run ml gcc (or module load gcc), which produces no output. This loads the default version of gcc installed on the system. The following example illustrates the process:
$ which gcc
/usr/bin/gcc
$ ml gcc
$ which gcc
/share/apps/spack/v1.0.7/software/linux-debian11-x86_64/gcc-10.2.1/gcc-11.2.0-ddklbacfnwoivgqapfyilqh332oze2rc/bin/gcc
This shows that executing ml gcc changes the gcc compiler from the default OS compiler to version 11.2.0.
To load a specific version of gcc, run e.g. ml gcc/10.3.0 (general syntax: ml <module>/<version>). If a version of gcc was already loaded, it is replaced, so that the currently loaded version becomes 10.3.0.
All loaded libraries that depend on the compiler are also reloaded to match the version of gcc that you just loaded. This keeps the set of loaded modules internally consistent.
To see the details of what is done when a module is loaded, use ml show <module>.
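When only the search-path changes are of interest, the output of module show can be filtered. This is a rough sketch under two assumptions: that the output goes to stderr (the Lmod default) and that path changes are reported as prepend_path(...) or append_path(...) lines, as is typical for Lua modulefiles. The helper name module_path_changes is hypothetical.

```shell
# Hypothetical helper: print only the search-path changes that
# loading a module would make.
module_path_changes() {
    # Assumption: module show prints to stderr and reports path changes
    # as prepend_path(...) or append_path(...) lines.
    module show "$1" 2>&1 | grep -E '(prepend|append)_path'
}
```

On a system with Lmod one might call it as module_path_changes gcc to see which directories the module adds to PATH and related variables.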
To inspect which modules you currently have loaded, simply run ml (or, equivalently, the longer commands module list or ml list):
$ ml
Currently Loaded Modules:
1) intel-compilers/2021.4.0
This example shows that only the module intel-compilers/2021.4.0 is loaded.
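A script can make the same check without parsing the listing. The sketch below relies on the assumption that the module system exports LOADEDMODULES as a colon-separated list of loaded modules, which Lmod normally does. The helper name module_is_loaded is hypothetical.

```shell
# Hypothetical helper: succeeds if the exact module (name/version)
# is currently loaded.
# Assumption: LOADEDMODULES holds a colon-separated list of loaded
# modules, as Lmod normally provides.
module_is_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For the listing shown above, module_is_loaded intel-compilers/2021.4.0 would succeed, while a partial name such as intel-compilers would not match.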
Note
The module command can be abbreviated to ml, and ml on its own is equivalent to module list.
The long command module load <module> has the short form ml <module>.
4.2.3. Unloading Modules
To unload a module that is already loaded, run ml rm <module> (or any of the equivalent commands module unload <module>, ml unload <module>, or ml -<module>).
Furthermore, doing
$ ml purge
will remove all modules that were loaded prior to executing the command (the command produces no output).
Note
The command ml purge is often used in Slurm scripts to ensure that the environment is clean before setting up the environment for a particular job.
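A job script following this pattern might look like the sketch below. The job name, resource values, and the helper setup_modules are hypothetical placeholders, and gcc/11.2.0 is taken from the listing earlier in this chapter. Whether the module command is available inside a batch job depends on how the shell is initialised on the cluster.

```shell
#!/bin/bash
#SBATCH --job-name=demo        # hypothetical job name
#SBATCH --ntasks=1
#SBATCH --time=00:05:00

# Hypothetical helper: start from a clean environment, then load
# exactly the modules given as arguments.
setup_modules() {
    module purge || return 1
    module load "$@"
}

# Only attempt this where a module command exists (e.g. on the cluster).
if command -v module >/dev/null 2>&1; then
    setup_modules gcc/11.2.0
    module list
fi
```

Purging first means the job does not silently inherit whatever happened to be loaded in the shell that submitted it.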
4.2.4. Hierarchical Modulefiles
We use module hierarchies to keep the environment consistent for our users.
Hierarchies unify module file names, and the module search path (MODULEPATH) is adjusted automatically as modules at one level of the hierarchy are loaded.
This means that only libraries or software applications that can be used together with a given compiler are shown by ml av.
Our module system has the following hierarchy structure:
Core
Compilers
MPI
LAPACK
An example illustrates how module hierarchies work. Just after logging in to the login node, one gets
$ ml av
--------------------------------- /usr/local/etc/lmod/Core ----------------------------------
Foss/2022.1 gcc/11.2.0 meson/0.60.0
Intel/2021.4 gh/2.0.0 ninja/1.10.2
anaconda3/2021.05 git/2.31.1 parallel/20210922
anaconda3/2022.05 (D) go/1.17.2 rclone/1.56.2
arm-forge/21.0 intel-compilers/2021.4.0 totalview/2020.0.25
cmake/3.21.4 julia/1.6.3 udunits/2.2.28
comsol/5.4 julia/1.6.6 (D)
comsol/5.6 (D) mercurial/5.8
Where:
D: Default Module
but after the Intel compiler is loaded via ml intel-compilers, one gets
$ ml intel-compilers
$ ml av
-- /share/apps/spack/v1.0.7/spack/../lmod/sandybridge/linux-debian11-x86_64/intel/2021.4.0 --
boost/1.77.0 h5utils/1.13.1 intel-mpi/2021.4.0 netcdf-fortran/4.5.3
cmake/3.21.4 (D) hdf5/1.10.7 intel-tbb/2021.4.0 netcdf/4.8.1
h5cpp/1.10.4-6 intel-mkl/2021.4.0 netcdf-cxx/4.3.1 pgplot/5.2.2
--------------------------------- /usr/local/etc/lmod/Core ----------------------------------
Foss/2022.1 gcc/11.2.0 meson/0.60.0
Intel/2021.4 gh/2.0.0 ninja/1.10.2
anaconda3/2021.05 git/2.31.1 parallel/20210922
anaconda3/2022.05 (D) go/1.17.2 rclone/1.56.2
arm-forge/21.0 intel-compilers/2021.4.0 (L) totalview/2020.0.25
cmake/3.21.4 julia/1.6.3 udunits/2.2.28
comsol/5.4 julia/1.6.6 (D)
comsol/5.6 (D) mercurial/5.8
Where:
L: Module is loaded
D: Default Module
Notice the additional modules under
/share/apps/spack/v1.0.7/spack/../lmod/sandybridge/linux-debian11-x86_64/intel/2021.4.0
that are now available for loading. All these software packages were compiled with the version of the Intel compiler that we just loaded, which is why they have now become visible.
For instance, loading the library boost/1.77.0 via ml boost/1.77.0 will load the build compiled with the Intel compiler that we loaded, and not the same version compiled with e.g. gcc (or with any other version of the Intel compiler). This is the beauty of hierarchical module systems compared to non-hierarchical ones, which show all versions of a library for all compilers used to build them. In the latter case the user is responsible for making sure that the loaded versions are internally consistent, which may be challenging.
A potential drawback of hierarchical modulefiles, however, is that a library is not necessarily visible if the compiler used to build it is not already loaded.
In practice this is not a problem, since the command ml spider comes to the rescue. This command takes a module name, or part of a module name, as an argument. To illustrate how it works, consider the following example:
$ ml spider hdf5
---------------------------------------------------------------------------------
hdf5: hdf5/1.10.7
---------------------------------------------------------------------------------
You will need to load all module(s) on any one of the lines below before
the "hdf5/1.10.7" module is available to load.
gcc/11.2.0
gcc/11.2.0 openmpi/4.1.1
intel-compilers/2021.4.0
intel-compilers/2021.4.0 intel-mpi/2021.4.0
Help:
HDF5 is a data model, library, and file format for storing and managing
data. It supports an unlimited variety of datatypes, and is designed for
flexible and efficient I/O and for high volume and complex data.
which shows that the hdf5 library is available only in version 1.10.7, but in four variants (at the time of writing). If no compiler is loaded, the hdf5 library (most likely) cannot be loaded directly. For a given compiler, both an MPI and a non-MPI build of the library are available. So to get the MPI build of hdf5 when using gcc/11.2.0, for instance, one first needs to load gcc/11.2.0 and then openmpi/4.1.1 before hdf5/1.10.7 can be loaded.
In summary, if you are looking for a software package, in particular when hierarchical modulefiles are used, run ml spider <package> to see how to load it (assuming it is installed on the system).
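The MPI build of hdf5 for gcc discussed above can be loaded as sketched below. The module names and versions are taken from the ml spider output shown earlier and may have changed since; the helper name load_hdf5_mpi_gcc is hypothetical.

```shell
# Hypothetical helper: load the MPI build of hdf5 for gcc, following
# the "gcc/11.2.0 openmpi/4.1.1" line of the ml spider output.
load_hdf5_mpi_gcc() {
    module purge || return 1
    # Modules are loaded left to right, so the compiler and MPI
    # modules are in place before hdf5 is looked up.
    module load gcc/11.2.0 openmpi/4.1.1 hdf5/1.10.7
}
```

Leaving out openmpi/4.1.1 would select the non-MPI build for the same compiler instead.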
4.3. Useful module commands
The most frequently used commands of the module system are:
Command (full) | Command (short) | Purpose |
---|---|---|
module list | ml | List active modules in the user environment |
module avail [module] | ml av [module] | List available modules in MODULEPATH |
module load <module> | ml <module> | Load a module file into the user's environment |
module unload <module> | ml rm <module> | Remove a loaded module from the user environment |
module purge | ml purge | Remove all modules from the user environment |
module swap <module1> <module2> | ml swap <module1> <module2> | Replace module1 with module2 |
module spider [module] | ml spider [module] | Query all modules in MODULEPATH and any module hierarchy |
module show <module> | ml show <module> | Show the commands performed when the module file is loaded |
Here <module> and [module] denote a required and an optional module name, respectively.
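To make the meaning of the swap command concrete, the sketch below spells out roughly what it amounts to: unloading the first module and then loading the second. The helper name swap_by_hand is hypothetical and only serves as an illustration; in practice module swap itself is the better choice.

```shell
# Hypothetical helper that spells out what a swap amounts to:
# unload the first module, then load the second.
swap_by_hand() {
    module unload "$1" || return 1
    module load "$2"
}
```

For example, swap_by_hand gcc/11.2.0 gcc/10.3.0 corresponds to ml swap gcc/11.2.0 gcc/10.3.0, with the version numbers taken from the examples earlier in this chapter.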
As an alternative to the module command, one may use the shorthand ml. The two forms compare as follows:
Command | Abbreviation |
---|---|
module list | ml |
module avail [module] | ml av [module] |
module load <module> | ml <module> |
module unload <module> | ml rm <module> |
module purge | ml purge |
module swap <module1> <module2> | ml swap <module1> <module2> |
module spider [module] | ml spider [module] |
module show <module> | ml show <module> |
We prefer the use of the shorthand module commands over the full commands.
If you want to know more about the various module commands, module help will give you an overview.
More detailed information about the module system and its commands can be found on the Lmod homepage and in the Lmod documentation, which also lists some more specialized commands.
4.4. Meta Modules
A meta module is a module whose main purpose is to load a set of other regular modules. On tux we have defined the following meta modules:
Name | Description | Loads |
---|---|---|
Foss | Free and Open Software Stack (Foss) | gcc; openmpi; openblas; fftw; scalapack |
Intel | Intel Software Stack | intel-compilers; intel-mpi; intel-mkl |
The Intel stack is based on the Intel oneAPI Toolkits.
The Foss stack:
$ ml purge
$ ml Foss
$ ml
Currently Loaded Modules:
1) gcc/11.2.0 3) openblas/0.3.18 5) scalapack/2.1.0
2) openmpi/4.1.1 4) fftw/3.3.10 6) Foss/2022.1
The Intel stack:
$ ml purge
$ ml Intel
$ ml
Currently Loaded Modules:
1) intel-compilers/2021.4.0 3) intel-mkl/2021.4.0
2) intel-mpi/2021.4.0 4) Intel/2021.4
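Switching between the two stacks from a script can be wrapped in a small function. This is only a sketch: the helper name select_stack is hypothetical, and it accepts just the two meta modules listed in the table above.

```shell
# Hypothetical helper: switch to one of the two meta modules.
select_stack() {
    case "$1" in
        Foss|Intel)
            # Start clean so modules from the other stack do not linger.
            module purge || return 1
            module load "$1"
            ;;
        *)
            echo "unknown stack: $1 (expected Foss or Intel)" >&2
            return 1
            ;;
    esac
}
```

Calling select_stack Foss mirrors the ml purge and ml Foss sequence shown above, and rejecting other names catches typos before any module is unloaded.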