Containers and HPC

Containers address the ever-growing complexity of software by storing an application, its dependencies, and its configuration in a single image, making it simple for users to pull images and run their code. This makes the software shareable and portable.

Singularity or Docker?

Singularity is an open-source container platform designed to be simple, fast, and secure. Unlike Docker, which requires root privileges to run containers, Singularity is designed for ease of use on shared multiuser systems and in high performance computing (HPC) environments. Singularity is compatible with most Docker images, and it can be used with GPUs and MPI applications.

Benefits of using containers

Containerized workflows
Portable software environments that make it easy to collaborate
Access to software not compatible with cluster OS
Build and use your own software
Share software among research groups
Bundles complex software for easy distribution

Pulling Containers

When pulling containers, Singularity will by default create a ~/.singularity directory within your home directory to store cache files. Over time the cache directory may use a significant amount of your /home quota. You may want to consider exporting the following environment variable to point the cache at your scratch directory instead.

export SINGULARITY_CACHEDIR=/scratch/<username>/.singularity

You can also add the previous command to your ~/.bashrc file so the variable is set automatically each time you log in.
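
For example, the setting can be appended to your ~/.bashrc like this (replace <username> with your own username):

echo 'export SINGULARITY_CACHEDIR=/scratch/<username>/.singularity' >> ~/.bashrc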

As stated earlier, the cache directory stores image data, making repeated downloads faster and less redundant. To view cache information, you can use the following command to see the number of cached container files and their total size.

singularity cache list

The following command will give you more information about the date, type, and size of the files. This may be useful when deciding to clean up your cache files.

singularity cache list --verbose

There may be times you wish to delete cached data. You can clear the cache by typing the following

singularity cache clean

To clean the Singularity cache by type, you can use the following

singularity cache clean --type=blob
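
Depending on the version of Singularity installed, a dry-run option may also be available to preview what would be removed without actually deleting anything:

singularity cache clean --dry-run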

There are many popular container registries to pull images from. When downloading Singularity containers from the Singularity Cloud Library, use the library:// prefix

singularity pull library://ubuntu

You can also download Docker containers from Docker Hub using the docker:// prefix. Most Docker container images will convert to the Singularity image format

singularity pull docker://ubuntu
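
You can also pull a specific tag and choose the name of the resulting image file; the tag and filename below are only examples:

singularity pull ubuntu_20.04.sif docker://ubuntu:20.04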

Pulling an image creates a Singularity Image Format (.sif) file in the current directory.

Images can be inspected for basic information using the following commands.

singularity inspect ubuntu.sif

Singularity inspect will allow you to see environment information, list apps within the container, and view the runscript. To view all options, use the following command

$ singularity inspect --help
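
For example, depending on the Singularity version installed, the environment variables defined in the image can be viewed with:

$ singularity inspect --environment ubuntu.sif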

Run and exec

singularity exec - Use for running a command within the container
singularity run - Use for running the pre-defined runscript within the container
singularity shell - Use for an interactive shell within the container

The singularity shell command is typically used to open a shell within the container. It is very useful for gathering information about the files inside the container. In this example, shell into ubuntu.sif

$ singularity shell ubuntu.sif
Singularity> cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.3 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.3 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
Singularity> exit

After exiting the container, run the following command

$ cat /etc/os-release

You will see the OS of the HPC cluster is different from that of ubuntu.sif. While within the Singularity shell you will have access to your home directory. Run the following commands to get an idea of what is available and accessible within the container. Be aware that many shell commands you use on the HPC cluster may not be installed within a container. For example, if you are used to using the nano text editor, it may not be installed within the Ubuntu container.

$ ls
$ pwd
$ whoami

The singularity run command will run the user-defined runscript within a container. Some containers do not have a default runscript.

$ singularity run <container image> <arg-1> <arg-2>

The runscript contains the user-defined commands within the container. To inspect the runscript, should one exist, before issuing the singularity run command, type the following:

$ singularity inspect --runscript <container_image>
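
For example, using the ubuntu.sif image pulled earlier:

$ singularity inspect --runscript ubuntu.sif
$ singularity run ubuntu.sif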

The singularity exec command will allow you to run an arbitrary command within the container.

$ singularity exec <container image> <command> <arguments>

Example: Suppose we have a gcc_8.4.0.sif image and would like to compile a helloworld.c file in the current directory. You can accomplish this by doing the following:

$ singularity exec gcc_8.4.0.sif gcc helloworld.c -o hello
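
The resulting binary can then be run through the same container; this assumes the current directory is accessible inside the container, which Singularity binds by default along with your home directory:

$ singularity exec gcc_8.4.0.sif ./hello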

Job Submission

#!/bin/bash
#SBATCH --job-name=testJob          # job name
#SBATCH --nodes=1                   # node(s) required for job
#SBATCH --ntasks=48                 # total number of tasks
#SBATCH --partition=general         # name of partition to submit job
#SBATCH --output=test-%j.out        # Output file. %j is replaced with job ID
#SBATCH --error=test_error-%j.err   # Error file. %j is replaced with job ID
#SBATCH --time=01:00:00             # Run time (D-HH:MM:SS)
#SBATCH --mail-type=ALL             # will send email for begin,end,fail
#SBATCH --mail-user=user@auburn.edu

module load singularity

singularity exec /path/to/<container image> <command> <args>
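
Submit the script to the scheduler with sbatch; the script name below is only an example:

$ sbatch singularity_job.sh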

Parallel MPI Code:

#!/bin/bash
#SBATCH --job-name=testJob          # job name
#SBATCH --nodes=2                   # node(s) required for job
#SBATCH --ntasks=48                 # total number of tasks
#SBATCH --partition=general         # name of partition to submit job
#SBATCH --output=test-%j.out        # Output file. %j is replaced with job ID
#SBATCH --error=test_error-%j.err   # Error file. %j is replaced with job ID
#SBATCH --time=01:00:00             # Run time (D-HH:MM:SS)
#SBATCH --mail-type=ALL             # will send email for begin,end,fail
#SBATCH --mail-user=user@auburn.edu

module load singularity
module load openmpi/4.0.3

srun singularity exec /path/to/<container image> <command> <args>

Note: srun is called from outside the image. The MPI library within the container will work in tandem with the MPI version on the cluster in a hybrid approach. For this reason, the MPI implementation within the container must be compatible with the MPI implementation on the host machine. You may need to try multiple Open MPI modules before you find one that works. For more information on Singularity and MPI, use the following link. https://sylabs.io/guides/latest/user-guide/mpi.html

Building an Image

You will not be able to build images while using Singularity on the Easley cluster. It is recommended that images are created on your local machine and transferred to the cluster. Sylabs.io also provides a remote builder option that allows users to securely build a container from a definition file, as sketched below.

https://cloud.sylabs.io/builder
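
As a sketch, a minimal definition file might look like the following; the base image, package, and file names are only illustrative:

Bootstrap: docker
From: ubuntu:20.04

%post
    apt-get update -y && apt-get install -y gcc

%runscript
    echo "Hello from the container"

After generating an access token on cloud.sylabs.io and logging in with singularity remote login, the definition file can be built remotely and the resulting image retrieved for transfer to the cluster:

singularity build --remote mycontainer.sif mycontainer.def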