Great Lakes

Globus maintenance happening at 9 a.m. on March 11


Due to planned maintenance by the vendor, Globus services will be unavailable for up to two hours beginning at 9 a.m. U.S. Eastern Time on Saturday, March 11, 2023.

Customers will not be able to authenticate or initiate any transfers during that time. Transfers that started before the outage will stall and then resume once maintenance is complete.

More details are available on the Globus blog.

For assistance or questions, please contact ARC at arc-support@umich.edu.

Protein structure prediction team achieved top rankings


CASP15 is a biennial competition that assesses methods of protein structure modeling. Independent assessors compared the submitted models with experimental structures, and the results and their implications were discussed at the CASP15 Conference, held in December 2022 in Turkey.

A joint team with members from the labs of Dr. Peter Freddolino and Dr. Yang Zhang took first place in the Multimer and Interdomain Prediction categories, and was again the top-ranked server in the Regular (domains) category according to the CASP assessors’ criteria.

These wins are well-earned. Freddolino noted, “This is a highly competitive event, against some of the very best minds and powerful companies in the world.”

The Zhang/Freddolino team competed against nearly 100 other groups, including teams from other academic institutions as well as major cloud and commercial companies. Groups from around the world submitted more than 53,000 models on 127 modeling targets in five prediction categories.

“Wei’s predictions did amazingly well in CASP15!” said Freddolino. Wei Zheng, Ph.D., is a lab member and a research fellow with the Department of Computational Medicine and Bioinformatics (DCMB).

Zheng said that the team participates in the regular protein structure prediction and protein complex structure prediction categories. “The results are assessed as regular protein domain modeling, regular protein inter-domain modeling, and protein complex modeling. In all categories, our models performed very well!” 

The technology that supported this impressive work 

The resources to achieve these results were grant-funded, which allowed the team to leverage a number of university resources, including:  

  • The Lighthouse High-Performance Computing (HPC) cluster service. Lighthouse is managed by the Advanced Research Computing (ARC) team, and ARC is a division of Information and Technology Services (ITS).
  • The Great Lakes HPC Cluster. The team’s GPU-intensive algorithms ran on Great Lakes. Graphics processing units (GPUs) are specialized processors originally designed to accelerate graphics rendering and well suited to highly parallel computation. The Great Lakes cluster provided additional capacity for compute cycles. Kenneth Weiss, IT project manager senior with DCMB and HITS, said that many of the algorithms used by Zheng benefited from the increased performance of computing the data on a GPU.
  • Multiple storage systems, including Turbo Research Storage. High-speed storage was crucial for holding the trained AI models and sequence libraries used by D-I-TASSER/DMFold-Multimer, the methods developed by Zhang, Freddolino, and Zheng.
  • Given the scale of the CASP targets, the grant-funded compute was augmented with capacity on the Great Lakes cluster. Freddolino and his team took advantage of the allocations provided by the ITS U-M Research Computing Package (UMRCP) and the HITS Michigan Medicine Research Computing Investment (MMRCI) programs, which substantially defrayed the cost of computing.
  • The collaboration tool Slack was used to keep Freddolino and Zheng in close contact with ARC and the DCMB teams. This provided the ability to deal with issues promptly, avoiding delays that would have had a detrimental impact on meeting CASP targets.

Technology staff from ARC, DCMB, and Health Information and Technology Services (HITS) provided assistance to the research team. All of the teams helped mitigate bottlenecks affecting the speed and throughput that Zheng needed for results. Staff also located and helped leverage resources, including those on Great Lakes, by utilizing available partitions and queues on the clusters.

“Having the flexibility and capacity provided by Great Lakes was instrumental in meeting competition deadlines,” said Weiss.

DCMB staff and the HITS HPC team took the lead on triaging software problems, giving Freddolino’s group high priority.

ARC Director Brock Palen provided monitoring and guidance on real-time impact and utilization of resources. “It was an honor to support this effort. It has always been ARC’s goal to take care of the technology so researchers can do what they do best. In this case, Freddolino and Zheng knocked it out of the park.”

Jonathan Poisson, technical support manager with DCMB, was instrumental in helping to select and configure the equipment purchased by the grant. “This assistance was crucial in meeting the tight CASP15 targets, as each target is accompanied by a deadline for results.” 

Read more on the Computational Medicine and Bioinformatics website and the Department of Biological Chemistry website.

Related presentation: D-I-TASSER: Integrating Deep Learning with Multi-MSAs and Threading Alignments for Protein Structure Prediction

The resources to achieve these results were provided by an NIH-funded grant (“High-Performance Computing Cluster for Biomedical Research,” SIG: S10OD026825). 

2023 Winter Maintenance & Globus File Transfer upgrade 


Winter maintenance is coming up! See the details below. Reach out to arc-support@umich.edu with questions or if you need help. 

These services will be unavailable: 

  • Great Lakes – We will be updating Great Lakes on a rolling basis throughout December and the beginning of January. If the rolling updates are successful, there should be no downtime or impact, with the following exceptions:
    • Single-precision GPU (SPGPU) nodes will be down Jan. 4-5 for networking maintenance. Those nodes will return to production when maintenance has been completed and the nodes have been reloaded.
    • Customers will be notified via email of any changes to Great Lakes maintenance that will require downtime.
    • If the rolling updates are unsuccessful, Great Lakes maintenance will begin Jan. 4-5, starting at 8 a.m. In either case, we will email everyone with the updated maintenance status.
  • Globus on the storage transfer nodes: Jan. 17-18.

Maintenance notes:

  • No downtime for ARC storage systems maintenance (Turbo, Locker, and Data Den).
  • Open OnDemand (OOD) users will need to re-login. Any existing jobs will continue to run and can be reconnected in the OOD portal.
  • Login servers will be updated, and the maintenance should not have any effect on most users. Those who are affected will be contacted directly by ARC. 
  • Copy any data and files that may be needed during maintenance to your local drive using Globus File Transfer before maintenance begins. 
  • Slurm email notifications will be improved, providing more detailed information about completed jobs.

Countdown to maintenance 

For Great Lakes HPC jobs, use the command “maxwalltime” to discover the amount of time remaining until maintenance begins. 

Jobs that request more walltime than remains until maintenance will automatically be queued and start once maintenance is complete. If the plan for Great Lakes maintenance is successful, any queued jobs will be able to run as usual (except for the SPGPU nodes as discussed above). Customers will be notified via email if downtime is required for Great Lakes.
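
For a sense of the logic at work, here is a minimal sketch (not an ARC utility) of the check the scheduler effectively performs: a job can start before the window only if its requested walltime fits in the time remaining. The maintenance start time in the sketch is a hypothetical placeholder.

```python
# Minimal sketch, not an ARC utility: mirrors the scheduler behavior described
# above. A job starts before maintenance only if its requested walltime fits
# in the remaining window; otherwise it waits until maintenance completes.
# The maintenance start time below is a hypothetical placeholder.
from datetime import datetime, timedelta

MAINTENANCE_START = datetime(2023, 1, 4, 8, 0)  # assumed window start

def fits_before_maintenance(requested_walltime, now=None):
    """Return True if a job requesting `requested_walltime` can finish in time."""
    now = now or datetime.now()
    return now + requested_walltime <= MAINTENANCE_START

if __name__ == "__main__":
    request = timedelta(hours=36)
    if fits_before_maintenance(request):
        print("Job should start and finish before maintenance.")
    else:
        print("Job will be held and started once maintenance is complete.")
```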

Status updates and additional information

How can we help you?

For assistance or questions, please contact ARC at arc-support@umich.edu.

Precision Health and ARC team up on a self-service tool for genetic research


Encore is a self-serve genetic analysis tool that researchers can now run using a point-and-click interface without the need to directly manipulate the genetic data. Only a phenotype file is needed to build a GWAS model with SAIGE (genetics analysis software), launch and monitor job progress, and interactively explore results.

It is geared for a range of disciplines and specialties including biostatistics, epidemiology, neuroscience, gastroenterology, anesthesiology, clinical pharmacy, and bioinformatics.

The tool was developed at the U-M School of Public Health Center for Statistical Genetics and is managed by Precision Health and supported by ITS’s Advanced Research Computing (ARC).  

Brock Palen, ARC director, said, “When someone uses Encore they are actually running on Great Lakes, and we are happy to provide the computational performance behind Encore.”

Using Encore is easy. No coding or command-line/Linux knowledge is required to run a GWAS in Encore. Researchers also do not need knowledge of batch job submission or scheduling, or direct access to a high-performance computing cluster. Encore automatically prepares job submission scripts and submits the analysis to the Great Lakes High-Performance Computing Cluster.
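
For the curious, the sketch below illustrates the general pattern a tool like this can automate: generate a Slurm batch script, then submit it with sbatch. It is only an illustration, not Encore’s actual code; the partition, resource requests, and the SAIGE command line shown are hypothetical placeholders.

```python
# Illustrative sketch only -- not Encore's implementation. Shows the general
# pattern a tool can automate: write a Slurm batch script, then submit it
# with sbatch. The partition, resources, and analysis command are placeholders.
import subprocess
import textwrap
from pathlib import Path

def submit_gwas_job(pheno_file, out_dir):
    script = textwrap.dedent(f"""\
        #!/bin/bash
        #SBATCH --job-name=encore-gwas
        #SBATCH --partition=standard
        #SBATCH --cpus-per-task=8
        #SBATCH --mem=32g
        #SBATCH --time=08:00:00
        run_saige_gwas --pheno {pheno_file} --out {out_dir}  # placeholder command
        """)
    script_path = Path(out_dir) / "gwas.sbat"
    script_path.write_text(script)
    # sbatch prints "Submitted batch job <id>"; return the job ID for monitoring.
    result = subprocess.run(["sbatch", str(script_path)],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip().split()[-1]

# Example: job_id = submit_gwas_job("phenotypes.csv", "/scratch/example")
```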

Great Lakes is the university’s flagship open-science high-performance computing cluster. It is much faster and more powerful than a laptop, and provides quicker answers and optimized support for simulation, genomics, machine learning, life science, and more. The platform provides a balanced combination of computing power, I/O performance, storage capability, and accelerators.

Visit the Encore wiki page to learn more.

To get started, send an email to PHDataHelp@umich.edu.

For questions about Great Lakes, contact arc-support@umich.edu.

Understanding the strongest electromagnetic fields in the universe


Alec Thomas is part of the team from the U-M College of Engineering Gérard Mourou Center for Ultrafast Optical Science that is building the most powerful laser in the U.S.

Dubbed “ZEUS,” the laser will deliver 3 petawatts of power. That’s a ‘3’ followed by 15 zeros, in watts. All the power generated in the entire world is about 10 terawatts, roughly 300 times less than the peak power of the ZEUS laser.

The team’s goal is to use the laser to explore how matter behaves in the most extreme electric and magnetic fields in the universe, and also to generate new sources of radiation beams, which may lead to developments in medicine, materials science, and national security. 

This simulation shows a plasma wake behind a laser pulse. The plasma behaves like water waves generated behind a boat. In this image, the “waves” are extremely hot plasma matter, and the “boat” is a short burst of powerful laser light. (Image courtesy of Daniel Seipt.)

“In the strong electric fields of a petawatt laser, matter becomes ripped apart into a ‘plasma,’ which is what the sun is made of. This work involves very complex and nonlinear physical interactions between matter particles and light. We create six-dimensional models of particles to simulate how they might behave in a plasma in the presence of these laser fields to learn how to harness it for new technologies. This requires a lot of compute power,” Thomas said.

That compute power comes from the Great Lakes HPC cluster, the university’s fastest high-performance computing cluster. The team developed equations describing the motion of each six-dimensional particle. These equations are solved on Great Lakes and help Thomas and his team learn how particles might behave within each simulation cell. Once the motion is understood, solutions can be developed.
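
As a rough illustration of the kind of computation involved (a toy sketch, not the group’s actual code), the example below advances particles with six coordinates each, three of position and three of momentum, through prescribed electric and magnetic fields using the Lorentz force. The field shapes, normalized units, and step sizes are arbitrary assumptions.

```python
# Toy illustration (not the research group's code): advance particles with six
# coordinates (position + momentum) through prescribed E and B fields via the
# Lorentz force, dp/dt = q (E + v x B). Units and fields are placeholders.
import numpy as np

def lorentz_push(x, p, E_func, B_func, dt, q=-1.0, m=1.0, steps=1000):
    """Advance positions x and momenta p (each shape (N, 3)) for `steps` steps."""
    c = 1.0  # normalized units
    for _ in range(steps):
        gamma = np.sqrt(1.0 + np.sum(p**2, axis=1, keepdims=True) / (m * c)**2)
        v = p / (gamma * m)
        E = E_func(x)
        B = B_func(x)
        p = p + dt * q * (E + np.cross(v, B))  # Lorentz force update
        x = x + dt * v
    return x, p

# Example: 100 particles in a uniform magnetic field along z, no electric field.
x0 = np.zeros((100, 3))
p0 = np.random.normal(size=(100, 3))
xf, pf = lorentz_push(x0, p0,
                      E_func=lambda x: np.zeros_like(x),
                      B_func=lambda x: np.tile([0.0, 0.0, 1.0], (x.shape[0], 1)),
                      dt=1e-3)
```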

“On the computing side, this is a very complex physical interaction. Great Lakes is designed to handle this type of work,” said Brock Palen, director of Advanced Research Computing, a division of Information and Technology Services. 

Thomas has signed up for allocations on the Great Lakes HPC cluster and Data Den storage. “I just signed up for the no-cost allocations offered by the U-M Research Computing Package. I am planning to use those allocations to explore ideas and concepts in preparation for submitting grant proposals.”

Learn more and sign up for the no-cost U-M Research Computing Package (UMRCP).

Prof. Thomas’ work is funded by a grant from the National Science Foundation.

Global research uses computing services to advance parenting and child development


Andrew Grogan-Kaylor, professor of Social Work, has spent the past 15 years studying the impact of physical discipline on children within the United States. 

Working with a team of other researchers at the School of Social Work, co-led by professors Shawna Lee and Julie Ma, he recently expanded his research to include children from all over the world, rather than exclusively the U.S. Current data for 62 low- and middle-income countries has been provided by UNICEF, a United Nations agency responsible for providing humanitarian and developmental aid to children worldwide. This data provides a unique opportunity to study the positive things that parents do around the world.

A group of smiling children. (Image by Eduardo Davad from Pixabay)

“We want to push research on parenting and child development in new directions. We want to do globally-based, diversity-based work, and we can’t do that without ARC services,” said Grogan-Kaylor. “I needed a bigger ‘hammer’ than my laptop provided.” 

The “hammer” he’s referring to is the Great Lakes HPC cluster. It can handle processing the large data set easily. When Grogan-Kaylor first heard about ARC, he thought it sounded like an interesting way to grow his science, and that included the ability to run more complicated statistical models that were overwhelming his laptop and department desktop computers. 

He took a workshop led by Bennet Fauber, ARC senior applications programmer/analyst, and found Bennet to be sensible and friendly. Bennet made HPC resources feel within reach to a newcomer. Typically, Grogan-Kaylor says, this type of resource is akin to learning a new language, and he’s found that being determined and persistent and finding the right people are key to maximizing ARC services. Bennet has explained error messages, how to upload data, and how to schedule jobs on Great Lakes. He also found a friendly and important resource at the ARC Help Desk, which is staffed by James Cannon. Lastly, departmental IT director Ryan Bankston has been of enormous help in learning about the cluster.

“We’re here to help researchers do what they do best. We can handle the technology, so they can solve the world’s problems,” said Brock Palen, ARC director. 

“Working with ARC has been a positive, growthful experience, and has helped me contribute significantly to the discussion around child development and physical punishment,” said Grogan-Kaylor. “I have a vision of where I’d like our research to go, and I’m pleased to have found friendly, dedicated people to help me with the pragmatic details.” 

More information

ARC, LSA support groundbreaking global energy tracking


How can technology services like high-performance computing and storage help a political scientist contribute to more equal access to electricity around the world? 

Brian Min, associate professor of political science and research associate professor with the Center for Political Studies, and lead researcher Zachary O’Keeffe have been using nightly satellite imagery to generate new indicators of electricity access and reliability across the world as part of the High-Resolution Electricity Access (HREA) project. 

The collection of satellite imagery is unique in its temporal and spatial coverage. For more than three decades, images have captured nighttime light output over every corner of the globe, every single night. By studying small variations in light output over time, the goal is to identify patterns and anomalies to determine if an area is electrified, when it got electrified, and when the power is out. This work yields the highest resolution estimates of energy access and reliability anywhere in the world.
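
As a toy illustration of that idea (not the HREA project’s actual statistical model), the sketch below takes one location’s nightly light readings, decides whether the location looks electrified, and flags nights that fall far below the location’s own baseline as candidate outages. The thresholds are arbitrary assumptions.

```python
# A toy sketch of the general idea described above -- not the HREA model.
# Given one location's nightly light readings, decide whether it appears
# electrified and flag nights that look like outages. Thresholds are arbitrary.
import numpy as np

def classify_nightly_lights(radiance, electrified_threshold=0.5, outage_factor=0.3):
    """radiance: 1-D array of nightly light output for one location."""
    baseline = np.median(radiance)
    electrified = baseline > electrified_threshold
    # Nights far below the location's own baseline are candidate outages.
    outages = electrified & (radiance < outage_factor * baseline)
    return electrified, outages

nights = np.array([0.9, 1.1, 1.0, 0.1, 1.0, 0.95])  # example readings
is_electrified, outage_nights = classify_nightly_lights(nights)
```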

This image of Kenya from 2017 shows a model-based classification of electrification status based upon all-night, statistically recalibrated 2017 VIIRS light output. (Image courtesy Dr. Min. Sources: NOAA, VIIRS DNB, Facebook/CIESIN HRSL.)

LSA Technology Services and ARC both worked closely with Min’s team to relieve pain points and design highly optimized, automated workflows. Mark Champe, application programmer/analyst senior with LSA Technology Services, explained, “A big part of the story here is finding useful information in datasets that were created and collected for other purposes. Dr. Min is able to ask these questions because the images were previously captured, and then it becomes the very large task of finding a tiny signal in a huge dataset.”

There are more than 250 terabytes of satellite imagery and data, across more than 3 million files. And with each passing night, the collection continues to grow. Previously, the images were not easily accessible because they were archived in deep storage in multiple locations. ARC provides processing and storage at a single place, an important feature for cohesive and timely research. 

The research team created computational models that run on the Great Lakes High-Performance Computing Cluster and that can be easily replicated and validated. They archive the files on the Locker Large-File Storage service.

One challenge Min and O’Keeffe chronically face is data management. Images can be hundreds of megabytes each, so just moving files from the storage service to the high-performance computing cluster can be challenging, let alone finding the right storage service. Using Turbo Research Storage and Globus File Transfer, Min and O’Keeffe found secure, fast, and reliable solutions to easily manage their large, high-resolution files.
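
Globus transfers can also be scripted. The sketch below uses the Globus Python SDK (globus_sdk) to show the general workflow of submitting a transfer between two collections; it is an illustration rather than the research team’s actual pipeline, and the endpoint IDs, paths, and access token are hypothetical placeholders.

```python
# A minimal sketch of scripting a transfer with the Globus Python SDK.
# Illustration only -- not the research team's pipeline. The endpoint UUIDs,
# paths, and token below are hypothetical placeholders.
import globus_sdk

ACCESS_TOKEN = "..."          # obtained via a Globus OAuth2 login flow
TURBO_ENDPOINT = "..."        # UUID of the Turbo collection (placeholder)
GREAT_LAKES_ENDPOINT = "..."  # UUID of the cluster collection (placeholder)

tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(ACCESS_TOKEN))

# Describe the transfer: checksum verification, recursive directory copy.
tdata = globus_sdk.TransferData(
    tc, TURBO_ENDPOINT, GREAT_LAKES_ENDPOINT,
    label="nightly-imagery", sync_level="checksum")
tdata.add_item("/imagery/2017/", "/scratch/hrea/2017/", recursive=True)

task = tc.submit_transfer(tdata)
print("Submitted Globus task:", task["task_id"])
```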

Brock Palen, director of ARC, said that top speeds of 1,400 megabytes per second were reached when moving files from Great Lakes to Turbo.

Min and team used Globus extensively in acquiring historical data from the National Oceanic and Atmospheric Administration (NOAA). Champe worked with the research team to set up a Globus connection to ARC storage services. The team at NOAA was then able to push the data to U-M quickly and efficiently. Rather than uploading the data to later be downloaded by Min’s team, Globus streamlined and sped up the data transfer process. 

Champe noted, “Over 100TB of data was being unarchived from tape and transferred between institutions. Globus made that possible and much less painful to manage.”

“The support we’ve gotten from ARC and LSA Technology has been incredible. They have made our lives easier by removing bottlenecks and helping us see new ways to draw insights from this unique data,” said Min. 

Palen added, “We are proud to partner with LSA Technology Services and ITS Infrastructure networking services to provide support to Dr. Min’s and O’Keeffe’s work. Their work has the potential to have a big impact in communities around the world.” 

“We should celebrate work such as this because it is a great example of impactful research done at U-M that many people helped to support,” Champe continued.

Min expressed his gratitude to the project’s partners. “We have been grateful to work with the World Bank and NOAA to generate new insights on energy access that will hopefully improve lives around the world.”

These images are now available via open access (free and available to all).

This is made possible by a partnership between the University of Michigan, the World Bank, Amazon Web Services, and NOAA.

Using machine learning and the Great Lakes HPC Cluster for COVID-19 research


A researcher in the College of Literature, Science, and the Arts (LSA) is pioneering two separate, ongoing efforts for measuring and forecasting COVID-19: pandemic modeling and a risk tracking site.

The projects are led by Sabrina Corsetti, a senior undergraduate student pursuing dual degrees in honors physics and mathematical sciences, and supervised by Thomas Schwarz, Ph.D., associate professor of physics. 

The modeling uses a machine learning algorithm that can forecast future COVID-19 cases and deaths. The weekly predictions are made using the ARC-TS Great Lakes High-Performance Computing Cluster, which provides the speed and dexterity to run the modeling algorithms and data analysis needed for data-informed decisions that affect public health. 

Each week, 51 processes (one for each state and one for the U.S.) are run in parallel (at the same time). “Running all 51 analyses on our own computers would take an extremely long time. The analysis places heavy demands on the hardware running the computations, which makes crashes somewhat likely on a typical laptop. We get all 51 done in the time it would take to do 1,” said Corsetti. “It is our goal to provide accurate data that helps our country.”
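
Schematically, the parallelism Corsetti describes looks like the sketch below: each region’s analysis is independent, so all 51 can run concurrently rather than back to back. This is only an illustration; run_forecast is a hypothetical stand-in for the team’s actual modeling step.

```python
# A schematic illustration, not the team's actual code: run the 51 independent
# region analyses concurrently, one per CPU core, instead of one after another.
# `run_forecast` is a hypothetical stand-in for the real modeling step.
from concurrent.futures import ProcessPoolExecutor

REGIONS = ["US", "AL", "AK", "AZ"]  # ... in practice, all 50 states plus "US"

def run_forecast(region):
    # Placeholder for the machine learning forecast for one region.
    return f"{region}: forecast complete"

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        for result in pool.map(run_forecast, REGIONS):
            print(result)
```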

The predictions for the U.S. at the national and state levels are fed into the COVID-19 Forecasting Hub, which is led by the UMass-Amherst Influenza Forecasting Center of Excellence based at the Reich Lab. The weekly predictions generated by the hub are then read by the Centers for Disease Control and Prevention (CDC) for their weekly forecast updates.

The second project, a risk tracking site, involves COVID-19 data acquisition from a Johns Hopkins University repository and the Michigan Safe Start Map. This is done on a daily basis, and the process runs quickly. It only takes about five minutes, but the impact is great. The data populates the COVID-19 risk tracking site for the State of Michigan, which shows, by county, the total number of COVID-19 cases, the average number of new cases in the past week, and the risk level.

“Maintaining the risk tracking site requires us to reliably update its data every day. We have been working on implementing these daily updates using Great Lakes so that we can ensure that they happen at the same time each day. These updates consist of data pulls from the Michigan Safe Start Map (for risk assessments) and the Johns Hopkins COVID-19 data repository (for case counts),” remarked Corsetti.
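
As a rough sketch of what such a daily pull can look like (not the team’s actual pipeline), the snippet below reads the Johns Hopkins CSSE U.S. case file and computes a seven-day average of new cases for each Michigan county. The specific file path and column layout are assumptions about that public repository.

```python
# A toy sketch, not the team's pipeline: pull the Johns Hopkins CSSE U.S. case
# file and compute a 7-day average of new cases for each Michigan county.
# The exact file path and column layout are assumptions about that repository.
import pandas as pd

JHU_CONFIRMED_US = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_time_series/"
    "time_series_covid19_confirmed_US.csv"
)

def michigan_new_cases_last_week():
    cases = pd.read_csv(JHU_CONFIRMED_US)
    mi = cases[cases["Province_State"] == "Michigan"]
    date_cols = mi.columns[11:]               # daily cumulative counts by date
    daily = mi[date_cols].diff(axis=1)        # cumulative -> daily new cases
    avg_last_week = daily.iloc[:, -7:].mean(axis=1)
    return pd.DataFrame({"county": mi["Admin2"], "avg_new_cases": avg_last_week})

# print(michigan_new_cases_last_week().head())
```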

“We are proud to support this type of impactful research during the global pandemic,” said Brock Palen, director of Advanced Research Computing – Technology Services. “Great Lakes provides quicker answers and optimized support for simulation, machine learning, and more. It is designed to meet the demands of the University of Michigan’s most intensive research.”

ARC is a division of Information and Technology Services (ITS). 

Related information 

Bring the power of the HPC clusters to your laptop 


Open OnDemand (OOD) is a tool that brings the power of Great Lakes, the university’s flagship open-science, high-performance computing cluster, to researchers and students.

Open OnDemand is a way for researchers and students to use a web interface to access the Advanced Research Computing – Technology Services (ARC-TS) Great Lakes and Lighthouse High-Performance Computing resources. Users do not need any technical training; it’s as simple as going to a browser and logging in, and they can start working immediately.

“It’s your laptop, but 1,000 times bigger,” said Brock Palen, director, ARC. “Open OnDemand offers our customers the speed and capacity of the HPC clusters without investing hours in training.”

The benefits of OOD are many, including providing easy file management, command-line shell access to the HPC clusters, job management and monitoring, and graphical desktop environments and desktop interactive applications such as RStudio, MATLAB, and Jupyter Notebook.

“This system works well for a range of fields from engineering to the physical and social sciences. Open OnDemand has lowered the barrier to access powerful HPC clusters so that students and researchers can do incredibly innovative work,” said Matt Britt, ARC HPC manager. 

Additional resources:

ARC is a division of Information and Technology Services (ITS).

3-2-1…blast off! COE students use ARC-TS HPC clusters for rocket design


The MASA team has been working with ARC-TS and the Great Lakes High-Performance Computing Cluster to rapidly iterate simulations. What previously took six hours on another cluster takes 15 minutes on Great Lakes. (Image courtesy of MASA)

This article was written by Taylor Gribble, the ARC-TS summer 2020 intern. 

The Michigan Aeronautical Science Association (MASA) is a student-run engineering team at U-M that has been designing, building, and launching rockets since its inception in 2003. Since late 2017, MASA has focused on developing liquid-bipropellant rockets—rockets that react a liquid fuel with a liquid oxidizer to produce thrust—in an effort to remain at the forefront of collegiate rocketry. The team is made up of roughly 70 active members, including both undergraduate and graduate students, who participate year-round.

Since 2018, MASA has been working on the Tangerine Space Machine (TSM) rocket which aims to be the first student-built liquid-bipropellant rocket to ever be launched to space. When completed, the rocket’s all-metal airframe will stand over 25 feet tall. The TSM will reach an altitude of 400,000 feet and will fly to space at over five times the speed of sound.

MASA is building this rocket as part of the Base 11 Space Challenge which was organized by the Base 11 Organization to encourage high school and college students to get involved in STEM fields. The competition has a prize of $1 million, to be awarded to the first team to successfully reach space. MASA is currently leading the competition, having won Phase 1 of the challenge in 2019 with the most promising preliminary rocket design.

Since the start of the TSM project, MASA has made great strides towards achieving its goals. The team has built and tested many parts of the complete system, including custom tanks, electronics, and ground support equipment. In 2020, the experimental rocket engine designed by MASA for the rocket broke the student thrust record when it was tested, validating the work that the team had put into the test.

The team’s rapid progress was made possible in part by extensive and lightning-quick simulations run on the ARC-TS Great Lakes High-Performance Computing Cluster.

The student engineers are Edward Tang, Tommy Woodbury, and Theo Rulko, and they have been part of MASA for over two years.

Tang is MASA’s aerodynamics and recovery lead and a junior studying aerospace engineering with a minor in computer science. His team is working to develop advanced in-house flight simulation software to predict how the rocket will behave during its trip to space.

“Working on the Great Lakes HPC Cluster allows us to do simulations that we can’t do anywhere else. The simulations are complicated and can be difficult to run. We have to check it, and do it again; over and over and over,” said Tang. “The previous computer we used would take as long as six hours to render simulations. It took 15 minutes on Great Lakes.”

This image shows a Liquid Oxygen Dome Coupled Thermal-Structural simulation that was created on the ARC-TS Great Lakes HPC Cluster. (Image courtesy of MASA)

Rulko, the team’s president, is a junior studying aerospace engineering with a minor in materials science and engineering.

Just like Tang, Rulko has experience using the Great Lakes cluster. “Almost every MASA subteam has benefited from access to Great Lakes. For example, the Structures team has used it for Finite Element Analysis simulations of complicated assemblies to make them as lightweight and strong as possible, and the Propulsion team has used it for Computational Fluid Dynamics simulations to optimize the flow of propellants through the engine injector. These are both key parts of what it takes to design a rocket to go to space which we just wouldn’t be able to realistically do without access to the tools provided by ARC-TS.”

Rulko’s goals for the team include focusing on developing as much hardware/software as possible in-house so that members can control and understand the entire process. He believes MASA is about more than just building rockets; his goal for the team is to teach members about custom design and fabrication and to make sure that they learn the problem-solving skills they need to tackle real-world engineering challenges. “We want to achieve what no other student team has.”

MASA has recently faced unforeseen challenges due to the COVID-19 pandemic that threaten not only to hurt the team’s timeline but also to derail the team’s cohesiveness. “Because of the pandemic, the team is dispersed literally all over the world. Working with ARC-TS has benefitted the entire team. The system has helped us streamline and optimize our workflow, and has made it easy to connect to Great Lakes, which allows us to rapidly develop and iterate our simulations while working remotely from anywhere,” said Tang. “The platform has been key to allowing us to continue to make progress during these difficult times.”

Tommy Woodbury is a senior studying aerospace engineering. Throughout his time on MASA he has been able to develop many skills. “MASA is what has made my time here at Michigan a really positive experience. Having a group of highly-motivated and supportive individuals has undoubtedly been one of the biggest factors in my success transferring to Michigan.”

This image depicts the Liquid Rocket Engine Injector simulation. (Image courtesy of MASA)

ARC-TS is a division of Information and Technology Services. Great Lakes is available without charge for student teams and organizations that need HPC resources. This program aims to give students access to high-performance computing to enhance their teams’ missions.