HPC, storage now more accessible for researchers



Information and Technology Services (ITS) has launched a new package of supercomputing resources for researchers and PhD students on all U-M campuses: the U-M Research Computing Package.

The U-M Research Computing Package will reduce current ITS rates for high-performance computing and research storage services by an estimated 35-40 percent, effective July 1.

In addition, beginning Sept. 1, university researchers will have access to a base allocation for high-performance computing and research storage services (including high-speed and archival storage) at no cost, thanks to an additional investment from ITS. These base allocations will meet the needs of approximately 75 percent of current high-performance computing users and 90 percent of current research storage users.

Learn more about the U-M Research Computing Package

 

Advanced research computing on the Great Lakes Cluster


OVERVIEW

This workshop will cover more advanced topics in computing on the U-M Great Lakes Cluster. Topics include a review of common parallel programming models and basic use of Great Lakes; dependent and array scheduling; workflow scripting using bash; high-throughput computing using launcher; parallel processing in one or more of Python, R, and MATLAB; and profiling of parallel code using Allinea Performance Reports and Allinea MAP.
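To give a flavor of the dependent and array scheduling material, here is a minimal sketch of submitting a job array and a follow-up job that waits on it. The account and partition names are placeholders, not values used in the course:

```shell
#!/bin/bash
# Submit a 10-task array job; --parsable makes sbatch print just the job ID.
# The account and partition names below are placeholders.
array_id=$(sbatch --parsable --array=1-10 \
    --account=example_account --partition=standard \
    --wrap='echo "processing chunk ${SLURM_ARRAY_TASK_ID}"')

# Submit a cleanup job that starts only after every array task succeeds.
sbatch --dependency=afterok:"${array_id}" \
    --account=example_account --partition=standard \
    --wrap='echo "all chunks done"'
```

The `--dependency=afterok:` form holds the second job until the first completes successfully; the workshop covers these and related scheduling options in detail.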

PRE-REQUISITES

This course assumes familiarity with the Linux command line as might be gained from the CSCAR/ARC-TS workshop Introduction to the Linux Command Line. In particular, participants should understand how files and folders work, be able to create text files using the nano editor, be able to create and remove files and folders, and understand what input and output redirection are and how to use them.

INSTRUCTORS

Dr. Charles J Antonelli
Research Computing Services
LSA Technology Services

Charles is a member of the LSA Technology Services Research team at the University of Michigan, where he is responsible for high performance computing support and education, and was an Advocate to the Departments of History and Communications. Prior to this, he built a parallel data ingestion component of a novel earth science data assimilation system, a secure packet vault, and worked on the No. 5 ESS Switch at Bell Labs in the 80s. He has taught courses in operating systems, distributed file systems, C++ programming, security, and database application design.

John Thiels
Research Computing Services
LSA Technology Services

MATERIALS

COURSE PREPARATION

In order to participate successfully in the workshop exercises, you must have a user login, a Slurm account, and be enrolled in Duo. The user login allows you to log in to the cluster, create, compile, and test applications, and prepare jobs for submission. The Slurm account allows you to submit those jobs, executing the applications in parallel on the cluster and charging their resource use to the account. Duo is required to help authenticate you to the cluster.

USER LOGIN

If you already have a Great Lakes user login, you don’t need to do anything. Otherwise, go to the Great Lakes user login application page at http://arc-ts.umich.edu/login-request/.

Please note that obtaining a user account requires human processing, so be sure to do this at least two business days before class begins.

SLURM ACCOUNT

We create a Slurm account for the workshop so you can run jobs on the cluster during the workshop and for one day after for those who would like additional practice. The workshop job account is quite limited and is intended only to run examples to help you cement the details of job submission and management. If you already have an existing Slurm account, you can use that, though if there are any issues with that account, we will ask you to use the workshop account.

DUO AUTHENTICATION

Duo two-factor authentication is required to log in to the cluster. When logging in, you will need to type your UMICH (AKA Level 1) password as well as authenticate through Duo in order to access Great Lakes.

If you need to enroll in Duo, follow the instructions at Enroll a Smartphone or Tablet in Duo.

Please enroll in Duo before you come to class.

LAPTOP PREPARATION

You will need VPN software to access the U-M network.  If you do not have VPN software already installed, please download and install the Cisco AnyConnect VPN software following these instructions.  You will need VPN to be able to use the ssh client to connect to Great Lakes. Please use the ‘Campus All traffic’ profile in the Cisco client.

You will need an ssh client to connect to the Great Lakes cluster. Mac OS X and Linux platforms have this built-in. Here are a couple of choices for Windows platforms:

  • Download and install U-M PuTTY/WinSCP from the Compute at the U website. This includes both the PuTTY ssh client and terminal emulator and a graphical file transfer tool in one installer.  This document describes how to download and use this software, except please note you will be connecting to greatlakes.arc-ts.umich.edu instead of the cited host.  You must have administrative authority over your computer to install this software.
  • Download PuTTY directly from the developer. Download the putty.exe application listed under “Alternative binary files,” then execute the application.  You do not need administrative authority over your computer to use this software.

Our Great Lakes User Guide in Section 1.2 describes in more detail how to use PuTTY to connect to Great Lakes.
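On macOS or Linux, the built-in client is all you need; a connection looks like the following (replace "uniqname" with your own U-M uniqname):

```shell
# Connect to the Great Lakes login node from a terminal.
ssh uniqname@greatlakes.arc-ts.umich.edu
# You will be prompted for your UMICH password and then a Duo approval.
```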

Please prepare and test your computer’s ability to make remote connections before class; we cannot stop to debug connection issues during the class.

A Zoom link will be provided to the participants the day before the class. Registration is required. Please note this session will be recorded.

 

Please register at https://ttc.iss.lsa.umich.edu/ttc/sessions/advanced-research-computing-on-the-great-lakes-cluster-10/register/

Introduction to Research Computing on the Great Lakes Cluster


OVERVIEW

This workshop will introduce you to high performance computing on the Great Lakes cluster.  After a brief overview of the components of the cluster and the resources available there, the main body of the workshop will cover creating batch scripts and the options available to run jobs, and hands-on experience in submitting, tracking, and interpreting the results of submitted jobs. By the end of the workshop, every participant should have created a submission script, submitted a job, tracked its progress, and collected its output. Additional tools including high-performance data transfer services and interactive use of the cluster will also be covered.
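The batch-script workflow described above can be previewed with a minimal example. The account name is a placeholder, and the exact options depend on your allocation:

```shell
#!/bin/bash
#SBATCH --job-name=hello            # name shown in the queue
#SBATCH --account=example_account   # placeholder; use your own Slurm account
#SBATCH --partition=standard
#SBATCH --time=00:05:00             # wall-clock limit (HH:MM:SS)
#SBATCH --ntasks=1

echo "Hello from $(hostname)"
```

Saved as, say, hello.sbat, such a job would be submitted with `sbatch hello.sbat`, tracked with `squeue -u $USER`, and its output collected from the `slurm-<jobid>.out` file the scheduler writes; the workshop walks through each of these steps hands-on.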

PRE-REQUISITES

This course assumes familiarity with the Linux command line as might be obtained from the ARC-TS workshop Introduction to the Linux Command Line. In particular, participants should understand how files and folders work, be able to create text files using the nano editor, and be able to create and remove files and folders.  Some exposure to shell input and output redirection and pipes would also be useful.

INSTRUCTORS

Dr. Charles J Antonelli
Research Computing Services
LSA Technology Services

Charles is a member of the LSA Technology Services Research team at the University of Michigan, where he is responsible for high performance computing support and education, and was an Advocate to the Departments of History and Communications. Prior to this, he built a parallel data ingestion component of a novel earth science data assimilation system, a secure packet vault, and worked on the No. 5 ESS Switch at Bell Labs in the 80s. He has taught courses in operating systems, distributed file systems, C++ programming, security, and database application design.

MATERIALS

COURSE PREPARATION

In order to participate successfully in the workshop exercises, you must have a user login, a Slurm account, and be enrolled in Duo. The user login allows you to log in to the cluster, create, compile, and test applications, and prepare jobs for submission. The Slurm account allows you to submit those jobs, executing the applications in parallel on the cluster and charging their resource use to the account. Duo is required to help authenticate you to the cluster.

USER LOGIN

If you already have a Great Lakes user login, you don’t need to do anything.  Otherwise, go to the Great Lakes user login application page at: http://arc-ts.umich.edu/login-request/

Please note that obtaining a user account requires human processing, so be sure to do this at least two business days before class begins.

SLURM ACCOUNT

We create a Slurm account for the workshop so you can run jobs on the cluster during the workshop and for one day after for those who would like additional practice. The workshop job account is quite limited and is intended only to run examples to help you cement the details of job submission and management. If you already have an existing Slurm account, you can use that, though if there are any issues with that account, we will ask you to use the workshop account.

DUO AUTHENTICATION

Duo two-factor authentication is required to log in to the cluster. When logging in, you will need to type your UMICH (AKA Level 1) password as well as authenticate through Duo in order to access Great Lakes.

If you need to enroll in Duo, follow the instructions at Enroll a Smartphone or Tablet in Duo.

Please enroll in Duo before you come to class.

LAPTOP PREPARATION

You will need VPN software to access the U-M network on which Great Lakes is located.  If you do not have VPN software already installed, please download and install the Cisco AnyConnect VPN software following these instructions.  Please use the ‘Campus All traffic’ profile in the Cisco client.

You will need an ssh client to connect to the Great Lakes cluster. Mac OS X and Linux platforms have this built-in. Here are a couple of choices for Windows platforms:

  • Download and install U-M PuTTY/WinSCP from the Compute at the U website. This includes both the PuTTY ssh client and terminal emulator and a graphical file transfer tool in one installer.  This document describes how to download and use this software, except please note you will be connecting to greatlakes.arc-ts.umich.edu instead of the cited host.  You must have administrative authority over your computer to install this software.
  • Download PuTTY directly from the developer. Download the putty.exe application listed under “Alternative binary files,” then execute the application.  You do not need administrative authority over your computer to use this software.

Our Great Lakes User Guide in Section 1.2 describes in more detail how to use PuTTY to connect to Great Lakes.

Please prepare and test your computer’s ability to make remote connections before class; we cannot stop to debug connection issues during the class.

A Zoom link will be provided to the participants the day before the class. Registration is required.  Please note this session will be recorded.

 

If you have questions about this workshop, please send an email to the instructors at hpc-course@umich.edu

Please register at https://ttc.iss.lsa.umich.edu/ttc/sessions/introduction-to-research-computing-on-the-great-lakes-cluster-8/register/

Advanced research computing on the Great Lakes Cluster


OVERVIEW

This workshop will cover more advanced topics in computing on the U-M Great Lakes Cluster. Topics include a review of common parallel programming models and basic use of Great Lakes; dependent and array scheduling; workflow scripting using bash; high-throughput computing using launcher; parallel processing in one or more of Python, R, and MATLAB; and profiling of parallel code using Allinea Performance Reports and Allinea MAP.

PRE-REQUISITES

This course assumes familiarity with the Linux command line as might be gained from the CSCAR/ARC-TS workshop Introduction to the Linux Command Line. In particular, participants should understand how files and folders work, be able to create text files using the nano editor, be able to create and remove files and folders, and understand what input and output redirection are and how to use them.

INSTRUCTORS

Dr. Charles J Antonelli
Research Computing Services
LSA Technology Services

Charles is a member of the LSA Technology Services Research team at the University of Michigan, where he is responsible for high performance computing support and education, and was an Advocate to the Departments of History and Communications. Prior to this, he built a parallel data ingestion component of a novel earth science data assimilation system, a secure packet vault, and worked on the No. 5 ESS Switch at Bell Labs in the 80s. He has taught courses in operating systems, distributed file systems, C++ programming, security, and database application design.

John Thiels
Research Computing Services
LSA Technology Services

MATERIALS

COURSE PREPARATION

In order to participate successfully in the workshop exercises, you must have a user login, a Slurm account, and be enrolled in Duo. The user login allows you to log in to the cluster, create, compile, and test applications, and prepare jobs for submission. The Slurm account allows you to submit those jobs, executing the applications in parallel on the cluster and charging their resource use to the account. Duo is required to help authenticate you to the cluster.

USER LOGIN

If you already have a Great Lakes user login, you don’t need to do anything. Otherwise, go to the Great Lakes user login application page at http://arc-ts.umich.edu/login-request/.

Please note that obtaining a user account requires human processing, so be sure to do this at least two business days before class begins.

SLURM ACCOUNT

We create a Slurm account for the workshop so you can run jobs on the cluster during the workshop and for one day after for those who would like additional practice. The workshop job account is quite limited and is intended only to run examples to help you cement the details of job submission and management. If you already have an existing Slurm account, you can use that, though if there are any issues with that account, we will ask you to use the workshop account.

DUO AUTHENTICATION

Duo two-factor authentication is required to log in to the cluster. When logging in, you will need to type your UMICH (AKA Level 1) password as well as authenticate through Duo in order to access Great Lakes.

If you need to enroll in Duo, follow the instructions at Enroll a Smartphone or Tablet in Duo.

Please enroll in Duo before you come to class.

LAPTOP PREPARATION

You will need VPN software to access the U-M network.  If you do not have VPN software already installed, please download and install the Cisco AnyConnect VPN software following these instructions.  You will need VPN to be able to use the ssh client to connect to Great Lakes. Please use the ‘Campus All traffic’ profile in the Cisco client.

You will need an ssh client to connect to the Great Lakes cluster. Mac OS X and Linux platforms have this built-in. Here are a couple of choices for Windows platforms:

  • Download and install U-M PuTTY/WinSCP from the Compute at the U website. This includes both the PuTTY ssh client and terminal emulator and a graphical file transfer tool in one installer.  This document describes how to download and use this software, except please note you will be connecting to greatlakes.arc-ts.umich.edu instead of the cited host.  You must have administrative authority over your computer to install this software.
  • Download PuTTY directly from the developer. Download the putty.exe application listed under “Alternative binary files,” then execute the application.  You do not need administrative authority over your computer to use this software.

Our Great Lakes User Guide in Section 1.2 describes in more detail how to use PuTTY to connect to Great Lakes.

Please prepare and test your computer’s ability to make remote connections before class; we cannot stop to debug connection issues during the class.

A Zoom link will be provided to the participants the day before the class. Registration is required. Please note this session will be recorded.

 

Please register at https://ttc.iss.lsa.umich.edu/ttc/sessions/advanced-research-computing-on-the-great-lakes-cluster-9/register/

Introduction to the Linux Command Line


OVERVIEW

This course will familiarize the student with the basics of accessing and interacting with Linux computers using the GNU/Linux operating system’s Bash shell, also generically referred to as “the command line”. Topics include: a brief overview of Linux, the Bash shell, navigating the file system, basic commands, shell redirection, permissions, processes, and the command environment. The workshop will also provide a quick introduction to nano, a simple text editor that will be used in subsequent workshops to edit files.
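A taste of the hands-on material, using a few of the commands covered (the file and folder names are illustrative):

```shell
# Create a working folder and move into it.
mkdir -p demo && cd demo

# Output redirection: > writes a file, >> appends to it.
echo "first line"  > notes.txt
echo "second line" >> notes.txt

# Input redirection: count the lines we just wrote (the count is 2).
wc -l < notes.txt

# Clean up.
cd .. && rm -r demo
```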

PRE-REQUISITES

None.

INSTRUCTOR

Kenneth Weiss
IT Project Senior Manager

HITS Academic IT – HPC team

Ken is a High Performance Computing Consultant with the Health Information Technology & Services (HITS) Academic IT – HPC team at the University of Michigan. He works with a team of IT specialists to provide high performance computing support and training for the Medical School. Prior to this, he spent 21 years managing research computing, including an HPC cluster, for Dr. Charles Sing in the Human Genetics Department.

MATERIALS

COURSE PREPARATION

You will need VPN software to access the U-M network.  If you do not have VPN software already installed, please download and install the Cisco AnyConnect VPN software following these instructions.  You will need VPN to be able to use the ssh client to connect to the training host. Please use the ‘Campus All traffic’ profile in the Cisco client.

You will need an ssh client to connect to the training host. Mac OS X and Linux platforms have this built-in. Here are a couple of choices for Windows platforms:

  • Download and install U-M PuTTY/WinSCP from the Compute at the U website. This includes both the PuTTY ssh client and terminal emulator and a graphical file transfer tool in one installer.  This document describes how to download and use this software.  A demonstration of this software will be given during class.  You must have administrative authority over your computer to install this software.
  • Download PuTTY directly from the developer. Download the putty.exe application listed under “Alternative binary files,” then execute the application.  You do not need administrative authority over your computer to use this software.

During class you will be given the name of the training host to be able to participate in the hands-on activities.

A Zoom link will be provided to the participants the day before the class. Registration is required.  Please note, this session will be recorded.  

If you have questions about this workshop, please send an email to the instructor at kgweiss@umich.edu

Register at https://ttc.iss.lsa.umich.edu/ttc/sessions/introduction-to-the-linux-command-line-31/register/

Introduction to Research Computing on the Great Lakes Cluster


OVERVIEW

This workshop will introduce you to high performance computing on the Great Lakes cluster.  After a brief overview of the components of the cluster and the resources available there, the main body of the workshop will cover creating batch scripts and the options available to run jobs, and hands-on experience in submitting, tracking, and interpreting the results of submitted jobs. By the end of the workshop, every participant should have created a submission script, submitted a job, tracked its progress, and collected its output. Additional tools including high-performance data transfer services and interactive use of the cluster will also be covered.

PRE-REQUISITES

This course assumes familiarity with the Linux command line as might be obtained from the ARC-TS workshop Introduction to the Linux Command Line. In particular, participants should understand how files and folders work, be able to create text files using the nano editor, and be able to create and remove files and folders.  Some exposure to shell input and output redirection and pipes would also be useful.

INSTRUCTORS

Dr. Charles J Antonelli
Research Computing Services
LSA Technology Services

Charles is a member of the LSA Technology Services Research team at the University of Michigan, where he is responsible for high performance computing support and education, and was an Advocate to the Departments of History and Communications. Prior to this, he built a parallel data ingestion component of a novel earth science data assimilation system, a secure packet vault, and worked on the No. 5 ESS Switch at Bell Labs in the 80s. He has taught courses in operating systems, distributed file systems, C++ programming, security, and database application design.

MATERIALS

COURSE PREPARATION

In order to participate successfully in the workshop exercises, you must have a user login, a Slurm account, and be enrolled in Duo. The user login allows you to log in to the cluster, create, compile, and test applications, and prepare jobs for submission. The Slurm account allows you to submit those jobs, executing the applications in parallel on the cluster and charging their resource use to the account. Duo is required to help authenticate you to the cluster.

USER LOGIN

If you already have a Great Lakes user login, you don’t need to do anything.  Otherwise, go to the Great Lakes user login application page at: http://arc-ts.umich.edu/login-request/

Please note that obtaining a user account requires human processing, so be sure to do this at least two business days before class begins.

SLURM ACCOUNT

We create a Slurm account for the workshop so you can run jobs on the cluster during the workshop and for one day after for those who would like additional practice. The workshop job account is quite limited and is intended only to run examples to help you cement the details of job submission and management. If you already have an existing Slurm account, you can use that, though if there are any issues with that account, we will ask you to use the workshop account.

DUO AUTHENTICATION

Duo two-factor authentication is required to log in to the cluster. When logging in, you will need to type your UMICH (AKA Level 1) password as well as authenticate through Duo in order to access Great Lakes.

If you need to enroll in Duo, follow the instructions at Enroll a Smartphone or Tablet in Duo.

Please enroll in Duo before you come to class.

LAPTOP PREPARATION

You will need VPN software to access the U-M network on which Great Lakes is located.  If you do not have VPN software already installed, please download and install the Cisco AnyConnect VPN software following these instructions.  Please use the ‘Campus All traffic’ profile in the Cisco client.

You will need an ssh client to connect to the Great Lakes cluster. Mac OS X and Linux platforms have this built-in. Here are a couple of choices for Windows platforms:

  • Download and install U-M PuTTY/WinSCP from the Compute at the U website. This includes both the PuTTY ssh client and terminal emulator and a graphical file transfer tool in one installer.  This document describes how to download and use this software, except please note you will be connecting to greatlakes.arc-ts.umich.edu instead of the cited host.  You must have administrative authority over your computer to install this software.
  • Download PuTTY directly from the developer. Download the putty.exe application listed under “Alternative binary files,” then execute the application.  You do not need administrative authority over your computer to use this software.

Our Great Lakes User Guide in Section 1.2 describes in more detail how to use PuTTY to connect to Great Lakes.

Please prepare and test your computer’s ability to make remote connections before class; we cannot stop to debug connection issues during the class.

A Zoom link will be provided to the participants the day before the class. Registration is required.  Please note this session will be recorded.

 

If you have questions about this workshop, please send an email to the instructors at hpc-course@umich.edu

Please register at https://ttc.iss.lsa.umich.edu/ttc/sessions/introduction-to-research-computing-on-the-great-lakes-cluster-6/register/

Introduction to the Linux Command Line


OVERVIEW

This course will familiarize the student with the basics of accessing and interacting with Linux computers using the GNU/Linux operating system’s Bash shell, also generically referred to as “the command line”. Topics include: a brief overview of Linux, the Bash shell, navigating the file system, basic commands, shell redirection, permissions, processes, and the command environment. The workshop will also provide a quick introduction to nano, a simple text editor that will be used in subsequent workshops to edit files.

PRE-REQUISITES

None.

INSTRUCTOR

Kenneth Weiss
IT Project Senior Manager

HITS Academic IT – HPC team

Ken is a High Performance Computing Consultant with the Health Information Technology & Services (HITS) Academic IT – HPC team at the University of Michigan. He works with a team of IT specialists to provide high performance computing support and training for the Medical School. Prior to this, he spent 21 years managing research computing, including an HPC cluster, for Dr. Charles Sing in the Human Genetics Department.

MATERIALS

COURSE PREPARATION

You will need VPN software to access the U-M network.  If you do not have VPN software already installed, please download and install the Cisco AnyConnect VPN software following these instructions.  You will need VPN to be able to use the ssh client to connect to the training host. Please use the ‘Campus All traffic’ profile in the Cisco client.

You will need an ssh client to connect to the training host. Mac OS X and Linux platforms have this built-in. Here are a couple of choices for Windows platforms:

  • Download and install U-M PuTTY/WinSCP from the Compute at the U website. This includes both the PuTTY ssh client and terminal emulator and a graphical file transfer tool in one installer.  This document describes how to download and use this software.  A demonstration of this software will be given during class.  You must have administrative authority over your computer to install this software.
  • Download PuTTY directly from the developer. Download the putty.exe application listed under “Alternative binary files,” then execute the application.  You do not need administrative authority over your computer to use this software.

During class you will be given the name of the training host to be able to participate in the hands-on activities.

A Zoom link will be provided to the participants the day before the class. Registration is required.  Please note, this session will be recorded.  

If you have questions about this workshop, please send an email to the instructor at kgweiss@umich.edu

Register at https://ttc.iss.lsa.umich.edu/ttc/sessions/introduction-to-the-linux-command-line-30/register/

Using tweets to understand climate change sentiment


A team from the Urban Sustainability Research Group of the School for Environment and Sustainability (UM-SEAS) has been studying public tweets to understand climate change and global warming attitudes in the U.S.

Dimitris Gounaridis is a fellow with the study. The team is mentored by Joshua Newell, and combines work on climate change perceptions by Jianxun Yang with property-level vulnerability assessment by Wanja Waweru.

“This research is timely and urgent. It helps us identify hazards, and elevated risks of flooding and heat, for socially vulnerable communities across the U.S. This risk is exacerbated especially for populations that do not believe climate change is happening,” Dimitris stated. 

The research team used a deep learning algorithm that recognizes text and predicts whether the person tweeting believes in climate change. The algorithm analyzed a total of 7 million public tweets drawn from the U-M Twitter Decahose and the George Washington University Libraries Dataverse, which together comprise a historical archive of Decahose tweets and an ongoing collection from the Decahose. The current deep learning model has an 85% accuracy rate and is validated at multiple levels.

The map below shows the prediction of specific users that believe or are skeptical of climate change and global warming. Dimitris used geospatial modeling techniques to identify clusters of American skepticism and belief to create the map.

A map of the United States with blue and red dots indicating climate change acceptance.

(Image courtesy Dimitris Gounaridis.)

The tweet stream is sampled in real-time. Armand Burks, a research data scientist with ARC, wrote the Python code that is responsible for continuously collecting the data and storing it in Turbo Research Storage. He says that many researchers across the university are using this data for various research projects as well as classes. 

“We are seeing an increased demand for shared community data sets like the Decahose. ARC’s platforms like Turbo, ThunderX, and Great Lakes, hold and process that data, and our data scientists are available, in partnership with CSCAR, to assist in deriving meaning from such large data. 

“This is proving to be an effective way to combine compute services, methodology, and campus research mission leaders to make an impact quickly,” said Brock Palen, director of ARC.

In the future, Dimitris plans to refine the model to increase its accuracy, and then combine that with climate change vulnerability for flooding and heat stress.

“MIDAS is pleased that so many U-M faculty members are interested in using the Twitter Decahose. We currently have over 40 projects with faculty in the Schools of Information, Kinesiology, Social Work, and Public Health, as well as at Michigan Ross, the Ford School, LSA and more,” said H.V. Jagadish, MIDAS director and professor of Electrical Engineering and Computer Science.

The Twitter Decahose is co-managed and supported by MIDAS, CSCAR, and ARC, and is available to all researchers without any additional charge. For questions about the Decahose, email Kristin Burgard, MIDAS outreach and partnership manager.

Advanced ML topics: Algorithms, writing ML code, comparing implementations


OVERVIEW

This workshop is designed as a follow-up to the basic introduction to machine learning earlier in this series. We will cover several examples in Python and compare different implementations. We will also look at advanced topics in machine learning, such as GPU optimization, parallel processing, and deep learning. A basic understanding of Python is required.

INSTRUCTORS

Meghan Richey
Machine Learning Specialist
Information and Technology Services – Advanced Research Computing – Technology Services

Meghan Richey is a machine learning specialist in the Advanced Research Computing – Technology Services department at the University of Michigan. She consults on several faculty and student machine learning applications and research studies, specializing in natural language processing and convolutional neural networks. Before her position at the university, Ms. Richey worked for a defense contractor as a software engineer to design and implement software solutions for DoD-funded artificial intelligence efforts.

A Zoom link will be provided to the participants the day before the class. Registration is required.

Instructor will be available at the Zoom link, to be provided, from 9-10 AM for computer setup assistance.

Please note, this session will be recorded.  

Register here

If you have questions about this workshop, please send an email to the instructor at richeym@umich.edu.

ARC, LSA support groundbreaking global energy tracking

By | General Interest, Great Lakes, HPC, News, Research, Uncategorized

How can technology services like high-performance computing and storage help a political scientist contribute to more equal access to electricity around the world? 

Brian Min, associate professor of political science and research associate professor with the Center for Political Studies, and lead researcher Zachary O’Keeffe have been using nightly satellite imagery to generate new indicators of electricity access and reliability across the world as part of the High-Resolution Electricity Access (HREA) project. 

The collection of satellite imagery is unique in its temporal and spatial coverage. For more than three decades, images have captured nighttime light output over every corner of the globe, every single night. By studying small variations in light output over time, the researchers aim to identify patterns and anomalies that reveal whether an area is electrified, when it was electrified, and when the power is out. This work yields the highest resolution estimates of energy access and reliability anywhere in the world.

A satellite image of Kenya in 2017

This image of Kenya from 2017 shows a model-based classification of electrification status based upon statistically recalibrated all-night 2017 VIIRS light output. (Image courtesy Dr. Min. Sources: NOAA, VIIRS DNB, Facebook/CIESIN HRSL).

LSA Technology Services and ARC both worked closely with Min’s team to relieve pain points and design highly optimized, automated workflows. Mark Champe, application programmer/analyst senior, LSA Technology Services, explained that, “a big part of the story here is finding useful information in datasets that were created and collected for other purposes. Dr. Min is able to ask these questions because the images were previously captured, and then it becomes the very large task of finding a tiny signal in a huge dataset.”

There are more than 250 terabytes of satellite imagery and data, across more than 3 million files. And with each passing night, the collection continues to grow. Previously, the images were not easily accessible because they were archived in deep storage in multiple locations. ARC provides processing and storage at a single place, an important feature for cohesive and timely research. 

The research team created computational models that run on the Great Lakes High-Performance Computing Cluster, and that can be easily replicated and validated. They archive the files on the Locker Large-File Storage service.

One challenge Min and O’Keeffe chronically face is data management. Images can be hundreds of megabytes each, so just moving files from the storage service to the high-performance computing cluster can be challenging, let alone finding the right storage service. Using Turbo Research Storage and Globus File Transfer, Min and O’Keeffe found secure, fast, and reliable solutions to easily manage their large, high-resolution files.
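For readers facing a similar data-movement problem, a transfer like this can be scripted with the Globus CLI. This is a minimal sketch only; the endpoint UUIDs and paths below are placeholders, not the actual Turbo or Great Lakes collections.

```shell
# One-time authentication with Globus (opens a browser prompt)
globus login

# Placeholder endpoint UUIDs -- substitute your own source and
# destination collections (e.g., a Turbo share and cluster scratch)
SRC="aaaaaaaa-0000-0000-0000-000000000001"
DST="aaaaaaaa-0000-0000-0000-000000000002"

# Recursively transfer a directory of imagery; checksum syncing
# skips files that already match at the destination, so interrupted
# transfers can be safely rerun
globus transfer "$SRC:/imagery/2017/" "$DST:/scratch/hrea/2017/" \
    --recursive --sync-level checksum --label "VIIRS 2017 imagery"
```

Because Globus manages the transfer asynchronously and retries failed files, a command like this can move terabytes without the researcher babysitting the session.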

Brock Palen, director of ARC, said that transfers from Great Lakes to Turbo reached top speeds of 1,400 megabytes per second.

Min and team used Globus extensively in acquiring historical data from the National Oceanic and Atmospheric Administration (NOAA). Champe worked with the research team to set up a Globus connection to ARC storage services. The team at NOAA was then able to push the data to U-M quickly and efficiently. Rather than uploading the data to later be downloaded by Min’s team, Globus streamlined and sped up the data transfer process. 

Champe noted, “Over 100TB of data was being unarchived from tape and transferred between institutions. Globus made that possible and much less painful to manage.”

“The support we’ve gotten from ARC and LSA Technology has been incredible. They have made our lives easier by removing bottlenecks and helping us see new ways to draw insights from this unique data,” said Min. 

Palen added, “We are proud to partner with LSA Technology Services and ITS Infrastructure networking services to provide support to Dr. Min’s and O’Keeffe’s work. Their work has the potential to have a big impact in communities around the world.” 

“We should celebrate work such as this because it is a great example of impactful research done at U-M that many people helped to support,” Champe continued.

Min expressed his gratitude to the project’s partners. “We have been grateful to work with the World Bank and NOAA to generate new insights on energy access that will hopefully improve lives around the world.”

These images are now available via open access (free and available to all).

This is made possible by a partnership between the University of Michigan, the World Bank, Amazon Web Services, and NOAA.