Summer 2022 Maintenance

Due to maintenance, the high-performance computing (HPC) clusters and their storage systems (/home and /scratch) will be unavailable:

  • Great Lakes: Monday, August 8 to Tuesday, August 9 
  • Armis2 and Lighthouse: Tuesday, August 9 to Wednesday, August 10 

Attention

  • No jobs will run that cannot be completed by the beginning of maintenance. 
    • Great Lakes maintenance begins on Monday, August 8. 
    • Armis2 and Lighthouse maintenance begins on Tuesday, August 9.
  • All jobs will be deleted before the start of maintenance. 
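
Because a job that cannot finish before maintenance will not start, it can help to work out the longest time limit a job can still request. The sketch below is illustrative only. The announcement gives a start date but no start time, so the midnight start used here is an assumption, and max_walltime is a hypothetical helper name. The result is formatted in Slurm's days-hours:minutes:seconds form.

```python
from datetime import datetime


def max_walltime(now, maintenance_start):
    """Return the longest time limit (D-HH:MM:SS) that ends before maintenance.

    Returns None when maintenance has already started.
    """
    total_seconds = int((maintenance_start - now).total_seconds())
    if total_seconds <= 0:
        return None
    days, rest = divmod(total_seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return "{}-{:02d}:{:02d}:{:02d}".format(days, hours, minutes, seconds)


# ASSUMPTION: the announcement states only the date (Monday, August 8);
# midnight is a placeholder, not a published start time.
GREAT_LAKES_START = datetime(2022, 8, 8, 0, 0, 0)

if __name__ == "__main__":
    submit_time = datetime(2022, 8, 5, 9, 30, 0)
    print(max_walltime(submit_time, GREAT_LAKES_START))  # 2-14:30:00
```

A value like this can be passed to sbatch with the --time option. Requesting somewhat less than the maximum leaves room for the job to wait in the queue.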

Action to take before maintenance

After maintenance

  • Submit your jobs to the cluster(s) at your convenience.
  • Python version 2.x will no longer be available once maintenance has been completed. 

How to access the preview clusters

Testing is an important component of the maintenance process. Discovering broken code early will ensure that there is enough time to put a fix in place so that your research is not disrupted. 

ARC recommends that you test if you can:

Here’s how you get access to the preview clusters:

Great Lakes:

  • Command line: ssh greatlakes8-login.arc-ts.umich.edu
  • Open OnDemand: https://greatlakes8.arc-ts.umich.edu

Armis2:

  • Command line: ssh armis28-login.arc-ts.umich.edu
  • Open OnDemand: https://armis28.arc-ts.umich.edu

Lighthouse:

  • Command line: ssh lighthouse8-login.arc-ts.umich.edu
  • Open OnDemand: https://lighthouse8.arc-ts.umich.edu
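
The three preview hostnames above appear to follow one pattern: the cluster name followed by "8". The sketch below encodes that pattern so a script can build the right login command or Open OnDemand URL. The pattern is inferred from the hostnames listed here, not an official naming rule, and preview_endpoints is a hypothetical helper name.

```python
# Cluster names as they appear in the hostnames listed above.
CLUSTERS = ("greatlakes", "armis2", "lighthouse")

# ASSUMPTION: inferred from the listed hostnames, not a documented convention.
PREVIEW_SUFFIX = "8"
DOMAIN = "arc-ts.umich.edu"


def preview_endpoints(cluster):
    """Return the ssh command and Open OnDemand URL for a preview cluster."""
    if cluster not in CLUSTERS:
        raise ValueError("unknown cluster: {!r}".format(cluster))
    host = cluster + PREVIEW_SUFFIX
    return {
        "ssh": "ssh {}-login.{}".format(host, DOMAIN),
        "ondemand": "https://{}.{}".format(host, DOMAIN),
    }


if __name__ == "__main__":
    for name in CLUSTERS:
        endpoints = preview_endpoints(name)
        print(name, endpoints["ssh"], endpoints["ondemand"])
```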

How to get help 

  • Send an email to arc-support@umich.edu.
  • Attend a virtual, drop-in office hour session (CoderSpaces) to get hands-on help from experts, available 9:30-11 a.m. and 2-3:30 p.m. on Tuesdays and Thursdays.

Changes

This maintenance period includes significant updates on all three clusters (Great Lakes, Armis2, and Lighthouse): changes to the operating system, OFED drivers, NVIDIA drivers, and software and Slurm versions. See details below.

System software changes

For each component, the new version is listed first, followed by the old version it replaces.

Operating system
  • New: Red Hat 8.4
    • kernel 4.18.0-305.25.1.el8_4
    • glibc 2.28-151
    • ucx 1.12.0-1.55103 (OFED provided)
    • gcc-8.4.1-1.el8
  • Old: CentOS 7.9
    • kernel 3.10.0-1160.45.1
    • glibc 2.17-325.el7_9
    • ucx 1.11.1-1.54303 (OFED provided)
    • gcc-4.8.5-44.el7

Mlnx-ofa_kernel-modules
  • New:
    • OFED 5.5.1.0.3.1
    • kver.4.18.0_305.25.1.el8_4
  • Old:
    • OFED 5.4-3.0.3.0
    • kver.3.10.0-1160.45.1

Slurm
  • New: Slurm 21.08.7, compiled with:
    • PMIx
      • /opt/pmix/2.2.5
      • /opt/pmix/3.2.3
      • /opt/pmix/4.1.2
    • hwloc 2.2.0-1 (OS provided)
    • ucx 1.12.0-1.55103 (OFED provided)
    • slurmrestd
    • slurm-libpmi
    • slurm-contribs
  • Old: Slurm 21.08.4, compiled with:
    • PMIx
      • /opt/pmix/2.2.5
      • /opt/pmix/3.2.3
      • /opt/pmix/4.1.0
    • hwloc 1.11.8-4 (OS provided)
    • ucx 1.11.1 (OFED provided)
    • slurmrestd
    • slurm-libpmi
    • slurm-contribs

PMIx LD config
  • New: /opt/pmix/2.2.5/lib
  • Old: /opt/pmix/2.2.5/lib

PMIx versions available in /opt
  • New: 2.2.5, 3.2.3, 4.1.2
  • Old: 2.2.5, 3.2.3, 4.1.0

Singularity
  • New: Singularity CE (Sylabs.io) 3.9.8
  • Old: Singularity (Sylabs.io) 3.7.3 and 3.8.4

NVIDIA driver
  • New: 495.44 (might change at the SW team's request)
  • Old: 495.44

Open OnDemand
  • New: TBD
  • Old: 2.0.20
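
When testing on a preview cluster, it can be useful to record which kernel, C library, compiler, and Slurm version a login session actually sees, and compare them with the versions listed above. The following is a minimal sketch using only the Python standard library. The helper names first_line and environment_report are hypothetical, and it assumes gcc and sinfo are on the PATH when present (a missing tool is reported as None).

```python
import platform
import shutil
import subprocess


def first_line(command):
    """Run a command and return the first line of its output, or None if not installed."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # works on Python 3.6, unlike text=True
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


def environment_report():
    """Collect version strings for the pieces of the stack that are changing."""
    libc_name, libc_version = platform.libc_ver()
    return {
        "kernel": platform.release(),
        "libc": (libc_name + " " + libc_version).strip(),
        "python": platform.python_version(),
        "gcc": first_line(["gcc", "--version"]),
        "slurm": first_line(["sinfo", "--version"]),
    }


if __name__ == "__main__":
    for component, version in environment_report().items():
        print("{:8s} {}".format(component, version))
```

Running the same script on the current cluster and on the preview cluster gives two reports that can be compared line by line.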

 

User software changes

New software versions
  • Python: version 3.6.8 is system provided; ARC will provide newer versions via modules

Deprecated versions (RIP)
  • Python 2.x

 

FAQ

Q: When is summer maintenance? 

A: Summer 2022 maintenance is happening August 8-10. The high-performance computing (HPC) clusters and their storage systems (/home and /scratch) will be unavailable:

  • Great Lakes: Monday, August 8 to Tuesday, August 9 
  • Armis2 and Lighthouse: Tuesday, August 9 to Wednesday, August 10 

 

Q: How should I prepare for the summer 2022 maintenance? 

A: There are a number of actions you can take ahead of maintenance, including: 

  • Use the preview clusters to test your code.
  • Copy any needed files to your local drive using Globus File Transfer.

 

Q: Can I run any of my jobs during maintenance? 

A: No. You can submit jobs at any time, but none will run during maintenance. They should start as expected once maintenance has been completed, including MATLAB and Python jobs.

 

Q: Will I have access to the clusters during the maintenance?

A: No. The clusters and their storage systems will be unavailable during maintenance, including files, jobs, and the command line. 

 

Q: Will there be any changes to my jobs after maintenance?

A: Possibly. The operating system, compilers, and libraries are changing, so compiled code may need to be rebuilt. Use the preview cluster to recompile and test your code before maintenance. If you don't get a chance to do so, you may need to recompile your code after maintenance.

 

Q: Where can I get help?

A: You can send an email to arc-support@umich.edu, or attend a virtual, drop-in office hour session (CoderSpaces) to get hands-on help from experts, available 9:30-11 a.m. and 2-3:30 p.m. on Tuesdays and Thursdays.

 

Q: How do I access the preview clusters? 

A: Use the login host or Open OnDemand URL for the preview cluster you want to test on.

Great Lakes:

  • Command line: ssh greatlakes8-login.arc-ts.umich.edu
  • Open OnDemand: https://greatlakes8.arc-ts.umich.edu

Armis2:

  • Command line: ssh armis28-login.arc-ts.umich.edu
  • Open OnDemand: https://armis28.arc-ts.umich.edu

Lighthouse:

  • Command line: ssh lighthouse8-login.arc-ts.umich.edu
  • Open OnDemand: https://lighthouse8.arc-ts.umich.edu

 

Q: What are the technical changes happening during maintenance?

A: See the system software and user software changes listed under Changes above.

 

Q: Will Python version 2.x work after maintenance?

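A: No. Python 2.x will no longer be available once maintenance has been completed. The system-provided Python will be version 3.6.8, and ARC will provide newer versions via modules. Be sure to test and update your processes, code, and libraries as needed.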
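
As a starting point for that testing, the sketch below shows a version guard plus three behaviors that commonly break Python 2 scripts under Python 3. It is an illustration, not a complete migration checklist, and check_interpreter is a hypothetical helper name. The code runs on Python 3.6 and newer.

```python
import sys


def check_interpreter(version_info=sys.version_info):
    """Return True when the given interpreter version is Python 3 or newer."""
    return version_info[0] >= 3


# 1. Division: "/" on two integers is true division in Python 3.
true_division = 7 / 2     # Python 2 gave 3; Python 3 gives 3.5
floor_division = 7 // 2   # use "//" where the old integer result is wanted

# 2. Text vs. bytes: they are distinct types in Python 3, so data read
#    as bytes must be decoded before it is treated as text.
job_name = b"job-1234".decode("ascii")

# 3. print is a function: "print x" is a syntax error in Python 3.
if __name__ == "__main__":
    if not check_interpreter():
        sys.exit("This script needs Python 3; found " + sys.version.split()[0])
    print("Interpreter OK:", sys.version.split()[0])
    print(true_division, floor_division, job_name)
```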