What is Great Lakes?
Great Lakes is an ARC-TS managed HPC cluster available to faculty (PIs) and their students/researchers. All computational work is scheduled via the Slurm resource manager and task scheduler. For detailed hardware information, see the configuration page. Great Lakes is not suitable for HIPAA or other sensitive data.
What forms do I need to fill out?
- The Principal Investigator (PI) needs to request a Slurm account, specifying the users who can access the account, the people who can administer it, and payment details.
- Each user given access to the account must request a user login. Please refer to the Great Lakes User Guide for additional steps and usage information.
How can I get a trial account on Great Lakes?
If you are a PI who hasn’t used Great Lakes before, you are eligible for a limited trial account. This account will have $150 worth of cluster time (see the rates page) and will be unable to run jobs after 1 month. If interested, please contact firstname.lastname@example.org, specifying that you’d like a trial account and including lists of users and admins.
Will my Turbo storage be available on Great Lakes?
Since Turbo is a storage service independent of Great Lakes, users who used Turbo on Flux will still be able to access their data on Great Lakes. The cost of Turbo will not change and no data needs to be transferred. If you have trouble accessing Turbo, please contact email@example.com.
How do I submit jobs using a web interface?
Great Lakes uses Open OnDemand for web-based access: users can submit jobs, manage the files in their home directory, view or delete active jobs, and open a web terminal session. Users can also run MATLAB, Jupyter Notebooks, and RStudio, or start a remote desktop.
You must be on campus or on the VPN to connect to Great Lakes OnDemand. For more information, see the OnDemand section of the Great Lakes User Guide.
How do I view the resource usage on my account?
To view TRES (Trackable RESource) utilization by user or account, use the following commands, substituting your own values for the dates, account name, usernames, and TRES type:
Shows TRES usage by all users on account during date range:
sreport cluster UserUtilizationByAccount start=mm/dd/yy end=mm/dd/yy account=test --tres type
Shows TRES usage by specified user(s) on account during date range:
sreport cluster UserUtilizationByAccount start=mm/dd/yy end=mm/dd/yy users=un1,un2 account=test --tres type
Lists users alphabetically along with TRES usage and total during date range:
sreport cluster AccountUtilizationByUser start=mm/dd/yy end=mm/dd/yy tree account=test --tres type
Possible TRES types:
To view disk usage and availability by user, type:
home-quota -u uniqname
For more reporting options, see the Slurm sreport documentation.
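As a sketch of how the placeholders in the sreport commands above might be filled in, the hypothetical helper below (usage_cmd is not part of Slurm or Great Lakes) prints a complete command line for a given account, date range, and TRES type. The dates and the account name test are placeholders; cpu is assumed here because it is a standard Slurm TRES name.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the sreport command for one account.
# Arguments: account name, start date, end date (mm/dd/yy), TRES type.
usage_cmd() {
    account="$1"; start="$2"; end="$3"; tres="$4"
    printf 'sreport cluster UserUtilizationByAccount start=%s end=%s account=%s --tres %s\n' \
        "$start" "$end" "$account" "$tres"
}

# Placeholder values; substitute your own account and dates.
usage_cmd test 01/01/24 01/31/24 cpu
```

Printing the command first makes it easy to check the date range before running it on a login node.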
What is a “root (_root) account”?
Each PI or project has a collection of Slurm accounts which could be used for different purposes (e.g. different grants or focuses of research) with different users. These Slurm accounts are contained within the PI/project’s root account (e.g. research_root). For example:
These accounts can have different limits on them, and are also collectively limited for /scratch usage and overall cluster usage.
As a PI, how can I limit usage on my account?
Principal Investigators can request that CPU, GPU, memory, billing units, and walltime be limited per user or group of users on their account. For more information, see the Great Lakes policy documentation.
Limits must be requested by emailing firstname.lastname@example.org.
As a PI, can I purchase my own nodes for Great Lakes?
PIs may purchase hardware for use on the Lighthouse cluster by emailing email@example.com to develop a hardware plan. Lighthouse utilizes the same Slurm job scheduler and infrastructure as Great Lakes, but purchased nodes can be used exclusively by the PI’s group.
What does my job status mean?
When listing your submitted jobs with squeue -u uniqname, the final column, titled “NODELIST(REASON)”, gives the reason that a job is not running yet. The possible statuses are:
Resources: This job is waiting for the resources (CPUs, Memory, GPUs) it requested to become available. Resources become available when currently running jobs complete. The job with Resources in the NODELIST(REASON) column is the top priority job and should be started next.
Priority: This job is not the top priority, so it must wait in the queue until it becomes the top priority job. Once it does, the NODELIST(REASON) column will change to “Resources”. The priority of all pending jobs can be shown with the sprio command. A job’s priority is determined by two factors: fairshare and age. The fairshare factor is influenced by the amount of resources that have been consumed by members of your Slurm account: more recent usage means a lower fairshare priority. The age factor is determined by the job’s queued time: the longer the job has been waiting in the queue, the higher the age priority.
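To illustrate how the two factors combine, the sketch below assumes Slurm's multifactor priority scheme, in which each factor is a fraction between 0.0 and 1.0 multiplied by an administrator-set weight, and the results are summed. The weights shown are hypothetical and are not the values configured on Great Lakes.

```shell
#!/bin/sh
# Hypothetical weights; the real values are set by the cluster administrators.
WEIGHT_AGE=1000
WEIGHT_FAIRSHARE=10000

# Arguments: age factor, fairshare factor (each between 0.0 and 1.0).
# Prints weight_age * age + weight_fairshare * fairshare, rounded.
job_priority() {
    awk -v a="$1" -v f="$2" -v wa="$WEIGHT_AGE" -v wf="$WEIGHT_FAIRSHARE" \
        'BEGIN { printf "%.0f\n", wa * a + wf * f }'
}

job_priority 0.50 0.20   # older job, account with heavy recent usage
job_priority 0.10 0.90   # newer job, account with little recent usage
```

With these made-up weights the newer job still ranks higher, because its account's fairshare factor outweighs the other job's longer wait.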
This job was submitted with a Slurm account that has a limit set on the number of CPUs that may be used at one time. This limit is set for all jobs by all users of the same Slurm account. Once some of the jobs running under this Slurm account complete, this reason will change to Priority or Resources unless there is some other limit or dependency. All jobs running under a given Slurm account can be viewed by running
This job was submitted with a Slurm account that has a limit set on the number of GPUs that may be used at one time. This limit is set for all jobs by all users of the same Slurm account. Once some of the jobs running under this Slurm account complete, this reason will change to Priority or Resources unless there is some other limit or dependency. All jobs running under a given Slurm account can be viewed by running
This job was submitted with a Slurm account that has a limit set on the amount of memory that may be used at one time. This limit is set for all jobs by all users of the same Slurm account. Once some of the jobs running under this Slurm account complete, this reason will change to Priority or Resources unless there is some other limit or dependency. All jobs running under a given Slurm account can be viewed by running
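One way to list the jobs running under a Slurm account is squeue's --account option. The sketch below assumes a placeholder account named test and only runs the command where squeue is installed.

```shell
#!/bin/sh
# Placeholder account name; substitute your own Slurm account.
account="test"

# Build the command as a string first so it can be inspected before running.
cmd="squeue --account=$account"
echo "$cmd"

# Run it only where Slurm is installed (e.g. on a cluster login node).
if command -v squeue >/dev/null 2>&1; then
    $cmd
fi
```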
This job was submitted with a Slurm account that has a limit set on the amount of monetary charges that may be accrued. Jobs that are pending with this reason will not start until the limit has been raised or the monthly bill has been processed.
This job has a dependency on another job. It will not start until that dependency is met. The most common dependency is waiting for another job to complete.
This job was submitted to the GPU partition, but did not request a GPU. This job will never start. This job should be deleted and resubmitted to a different partition or if a GPU is needed, resubmitted to the GPU partition with a GPU request. A GPU can be requested by adding the following line to a batch script:
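As a minimal sketch, a GPU is typically requested in a Slurm batch script with the --gres option. The partition name gpu and the account name test below are assumptions for illustration; check the Great Lakes User Guide for the actual values.

```shell
#!/bin/bash
#SBATCH --job-name=gpu-example
# Placeholder Slurm account; substitute your own.
#SBATCH --account=test
# Assumed name of the GPU partition.
#SBATCH --partition=gpu
# Request one GPU.
#SBATCH --gres=gpu:1
#SBATCH --time=00:10:00

# Slurm normally exports CUDA_VISIBLE_DEVICES for jobs that were granted GPUs.
msg="GPU devices visible to this job: ${CUDA_VISIBLE_DEVICES:-none}"
echo "$msg"
```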
How can I access on-campus restricted software?
From the Command Line
Use an SSH client to log in to an on-campus login node at gl-campus-login.arc-ts.umich.edu.
From Open On-Demand
Open your browser (Firefox, Edge, or Chrome in an incognito tab – recommended) and navigate to greatlakes-oncampus.arc-ts.umich.edu.
What are the SSH public keys for Great Lakes?
If you wish to pre-populate your SSH client configuration with the publicly available keys for Great Lakes, they are as follows:
greatlakes.arc-ts.umich.edu,greatlakes-oncampus.arc-ts.umich.edu,gl-login?.arc-ts.umich.edu,22.214.171.124,126.96.36.199,188.8.131.52,184.108.40.206 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBHWel/rAXqIJYxexVzMSlgy/fICWukn8DaOGMPpAomH1E5AhCjrH2zMMTJHtXYsRA+brm/sTbn21Zw+pgpgJSYA=
greatlakes.arc-ts.umich.edu,greatlakes-oncampus.arc-ts.umich.edu,gl-login?.arc-ts.umich.edu,220.127.116.11,18.104.22.168,22.214.171.124,126.96.36.199 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAICwaAq9LI48vVO4qbt35Xfz1pi+RE1Krq1iIeJQqoFEw
greatlakes.arc-ts.umich.edu,greatlakes-oncampus.arc-ts.umich.edu,gl-login?.arc-ts.umich.edu,188.8.131.52,184.108.40.206,220.127.116.11,18.104.22.168 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEA16eDiBWF3SgPQXEeJsH8dsxO8x3o5KkdqWMg/lK57Kpwf4QGXJNvYy0jxSAuKTRim/ob6+nDRH8zIOwnl9tlyEw+8VN3WR8nqBqxX6Km2yzTOMO8Lh7fLuMTZHOdEz0uOn6tBP8LTMtHN9h/fANjKFVl8N+jsejMXrPf0w7jGjc=
On a Mac or Linux machine you’ll add the keys to your known_hosts file.
On a Mac this file is: /Users/<username>/.ssh/known_hosts.
On Linux: /home/<username>/.ssh/known_hosts.
The known_hosts file should have 644 (i.e. -rw-r--r--) permissions.
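The steps above can be sketched as a small shell function. add_host_key is a hypothetical helper, not a standard tool: it appends one key line to a known_hosts file only if that exact line is not already present, then applies the 644 permissions.

```shell
#!/bin/sh
# Hypothetical helper: add one host-key line to a known_hosts file.
# Arguments: path to the known_hosts file, full key line to add.
add_host_key() {
    file="$1"; line="$2"
    mkdir -p "$(dirname "$file")"
    touch "$file"
    # -x matches the whole line, -F treats it as a fixed string,
    # so an identical entry is never added twice.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
    chmod 644 "$file"
}

# Usage: pass one complete key line from the list above as the second argument.
# add_host_key "$HOME/.ssh/known_hosts" '<full key line>'
```

Calling the function once per key line keeps the file free of duplicates if the commands are run again later.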
If you are using an SSH client that is not part of your operating system (e.g. Windows using PuTTY), please see the client documentation referring to host key verification.
A good start for PuTTY users can be found here (section A.2.9, “Is there an option to turn off the annoying host key prompts?”).
Can I get a refund on a failed job?
Refunds, if any, are at the discretion of ARC and will only be granted for system-wide preventable issues. This does not include hardware failure, power failures, job failures, or similar issues. For more information, see the Great Lakes policies.
If you have a problem not listed here, please send an email to firstname.lastname@example.org.