Tag

storage

PFAS research in the Michigan mother-infant pairs study, supported by ITS, SPH, MM, AGC

By | News

Three mothers holding their infants. Everyone is sitting on a couch.

PFAS (per- and polyfluoroalkyl substances) are a class of chemicals that have been around since the 1940s and became more broadly used in the post-war 1960s era. PFAS are in our homes, offices, water, and even our food and blood. PFAS break down slowly and are difficult to process, both in the environment and our bodies.

Scientific studies have shown that exposure to some PFAS in the environment may be linked to harmful health effects in humans and animals. Because there are thousands of PFAS chemicals found in many different consumer, commercial, and industrial products, it is challenging to study and assess the human health and environmental risks. 

Fortunately, some of the most persistent PFAS are being phased out. The EPA has been working on drinking water protections, scientists are working on ways to break down and eliminate PFAS, and PFAS are being addressed at a national level.

A team of University of Michigan researchers from the School of Public Health DoGoodS-Pi Environmental Epigenetics Lab and Michigan Medicine is working to understand how behaviors and environments during pregnancy can cause changes to the way genes work in offspring. This emerging field is known as toxicoepigenetics.

Jackie Goodrich, Ph.D., research associate professor at the U-M School of Public Health, led the team. “PFAS may impact the development of something we all have called the epigenome. The epigenome is a set of modifications on top of our DNA that controls normal development and function. Environmental exposures like PFAS can alter how the epigenome forms, and this impacts development and health. Our study expands on current knowledge about PFAS and the epigenome by focusing on a type of epigenetic mark that is not usually measured.”

Vasantha Padmanabhan, Ph.D., M.S., professor emerita (in service), Department of Pediatrics, Michigan Medicine, built the Michigan Mother-Infant Pairs study over the past decade with an emphasis on identifying harmful exposures during pregnancy that impact women and their newborns. “I am so grateful to those who engaged in this study. PFAS are complex, and mothers’ and infants’ involvement helped us work toward a solution that impacts us all. I want to acknowledge the contributions of the U-M Department of Obstetrics and Gynecology, Michigan Institute for Clinical & Health Research (MICHR), and the Von Voigtlander Women’s Hospital that made this study possible.” 

Rebekah Petroff, Ph.D., a research fellow with Environmental Health Sciences, led the computation portion of the research. She said that using Turbo to store the raw data and Great Lakes for high-performance computing (HPC) made the analysis much faster, which was essential for a study with so much data.

Turbo and Great Lakes are services provided by Advanced Research Computing, a division of Information and Technology Services (ITS). ARC facilitates powerful approaches to complex research challenges in fields ranging from physics to linguistics, and from engineering to medicine.

Petroff said, “This analysis would have taken over a month straight of computing time on a regular desktop computer. The first job we submitted to Great Lakes ran so fast—I had results the next morning! Great Lakes made this research possible, and I believe that our study results can be broadly impactful to public health and toxicoepigenetics going forward.”

Support for using this complex technology also came from Dan Barker, a UNIX systems admin with the U-M School of Public Health Biostatistics Department. Barker assisted with the code needed to use Great Lakes. “We started with a test run of a few hundred pairs of genomes. Once we were successful with that, we ran the entire nearly 750,000 epigenetic marks across 141 people and seven different PFAS.”

Barker also helped design and submit array jobs, which are a series of identical, or nearly identical, tasks that are run multiple times. This is a common technique among researchers who use HPC. Array jobs allow for essential analytical comparisons among the test results. Petroff said, “In our study, we used an array job to split up our computations so that they ran much more efficiently!”
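As a rough illustration of the array-job idea described above, the sketch below splits a large set of independent computations into contiguous chunks, one per array task. It assumes a Slurm-style scheduler that exposes the task index through the SLURM_ARRAY_TASK_ID environment variable. The function name chunk_bounds, the task count of 100, and the zero-based task numbering are illustrative choices, not details from the study.

```python
import os


def chunk_bounds(n_items: int, n_tasks: int, task_id: int) -> tuple[int, int]:
    """Return the half-open range [start, stop) of items handled by one array task."""
    if not 0 <= task_id < n_tasks:
        raise ValueError("task_id must be in [0, n_tasks)")
    base, extra = divmod(n_items, n_tasks)
    # The first `extra` tasks each take one additional item so every item is covered.
    start = task_id * base + min(task_id, extra)
    stop = start + base + (1 if task_id < extra else 0)
    return start, stop


if __name__ == "__main__":
    # Hypothetical sizes: about 750,000 marks split across 100 array tasks.
    n_marks, n_tasks = 750_000, 100
    # Assumption: the scheduler sets SLURM_ARRAY_TASK_ID (0-based here); default to 0 locally.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    start, stop = chunk_bounds(n_marks, n_tasks, task_id)
    print(f"task {task_id} processes marks {start} to {stop - 1}")
```

Each task runs the same script on its own slice, so the scheduler can run many slices at once instead of one long serial loop.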

The U-M Advanced Genomics Core (AGC) performed the epigenetic assays, a kind of laboratory technique that measures marks on your DNA, for this project. AGC is part of the Biomedical Research Core Facilities (BRCF), the campus-wide laboratories that develop and provide state-of-the-art scientific resources to enable biomedical research. Other BRCF cores also worked on this project, including the Epigenomics Core and the Bioinformatics Core.

Genotyping is similar to reading a few words scattered on a page. This process gives researchers small packets of data to compare. Genotyping looks for information at a specific place in the DNA where we know important data will be. This project used a type of genotyping called microarrays (also known as “arrays”), which help researchers understand how the regulation of DNA, including the methylation and hydroxymethylation measured in this study, is impacted by exposures like PFAS.

Brock Palen, ARC director, said, “This research is of human interest and impacts all of us. I’m pleased that ARC assisted their research with staff expertise, equipment, and no-cost allocations from the U-M Research Computing Package.”

Petroff said that follow-up studies are needed to better understand whether the results are universal or specific to this cohort of infants and parents. If the results hold steady, then a significant discovery has been made that will lead to more comprehensive PFAS mitigation solutions. “Although steps are being taken to mitigate PFAS, exposure is still prevalent, and a deeper understanding of how it impacts humans is needed,” said Dana Dolinoy, Ph.D., NSF International Department Chair of Environmental Health Sciences and epigenetics expert.

Read the full article: Mediation effects of DNA methylation and hydroxymethylation on birth outcomes after prenatal per- and polyfluoroalkyl substances (PFAS) exposure in the Michigan mother–infant pairs cohort.

Funding was provided by grants from the National Institutes of Health, the U.S. Environmental Protection Agency, and the National Institute of Environmental Health Sciences Children’s Health Exposure Analysis Resource program.

ARC Summer 2023 Maintenance happening in June

By | HPC, News, Systems and Services

Summer maintenance will be happening earlier this year (June instead of August). Updates will be made to software, hardware, and operating systems to improve the performance and stability of services. ARC works to complete these tasks quickly to minimize the impact of the maintenance on research.

The dates listed below are the weeks the work will be occurring; the actual dates will be revised as planning continues.

HPC clusters and storage systems (/scratch) will be unavailable:

  • June 5-9: Great Lakes, Armis2, and Lighthouse

Storage systems will be unavailable:

  • June 6-7: Turbo, Locker, and Data Den

Queued jobs and maintenance reminders

Jobs will remain queued and will automatically begin after the maintenance is completed. The command “maxwalltime” will show the amount of time remaining until maintenance begins for each cluster, so you can size your jobs appropriately. The countdown to maintenance will also appear on the ARC homepage.
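To make the job-sizing idea above concrete, here is a minimal sketch of the underlying arithmetic: the time left before maintenance, minus a safety margin, formatted as D-HH:MM:SS. It is not the implementation of the “maxwalltime” command. The function name, the 30-minute margin, and the maintenance start time are assumptions for illustration, and the D-HH:MM:SS form assumes a Slurm-style scheduler.

```python
from datetime import datetime, timedelta


def walltime_until(maintenance_start: datetime, now: datetime,
                   safety_margin: timedelta = timedelta(minutes=30)) -> str:
    """Format the time left before maintenance as D-HH:MM:SS, minus a safety margin."""
    remaining = maintenance_start - now - safety_margin
    if remaining <= timedelta(0):
        raise ValueError("no time left before maintenance")
    total_seconds = int(remaining.total_seconds())
    days, rest = divmod(total_seconds, 86_400)
    hours, rest = divmod(rest, 3_600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    # Hypothetical start time; the announcement gives only the week of June 5-9.
    start = datetime(2023, 6, 5, 6, 0, 0)
    now = datetime(2023, 6, 2, 14, 15, 0)
    print(walltime_until(start, now))
```

A job that requests no more than this much time can finish before maintenance begins; a longer request would simply wait in the queue until the maintenance is over.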

How can we help you?

For assistance or questions, contact ARC at arc-support@umich.edu.

Data Den now supports sensitive data

By | News, Uncategorized

Data Den Research Archive is a service for preserving electronic data generated from research activities. It is a low-cost, highly durable storage system and is the largest storage system operated by ARC. Storage of sensitive data (including HIPAA, PII, and FERPA) is now supported (visit the Sensitive Data Guide for full details). This service is part of the U-M Research Computing Package (UMRCP) that provides storage allocations to researchers. Most researchers will not have to pay for Data Den.

A disk-caching, tape-backed archive, this storage service is best for data that researchers do not need regularly, but still need to keep because of grant requirements. 

“Data Den is a good place to keep research data past the life of the grant,” said Jeremy Hallum, ARC research computing manager. “ARC can store data that researchers need to keep for five to ten years.” 

Hallum goes on to say that Data Den is only available in a replicated format. “Volumes of data are duplicated between servers or clusters for disaster recovery so research data is very safe.”

Data Den can be part of a well-organized data management plan providing international data sharing, encryption, and data durability. Jerome Kinlaw, ARC research storage lead, said that the Globus File Transfer service works well for data management. “Globus is easy to use for moving data in and out of Data Den.”

The ITS U-M Research Computing Package (UMRCP) provides 100 terabytes (TB) of Data Den storage to qualified researchers. This 100 TB can be divided between restricted and non-restricted variants of Data Den for use as needed. (The ITS Data Storage Finder can help researchers find the right storage solutions to meet their needs.)

“I’m pleased that Data Den now offers options for sensitive data, and that researchers can take advantage of the UMRCP allocations,” said Brock Palen, ARC director. “We want to lighten the load so that researchers can do what they do best, and our services are now more cost effective than ever.”

Flux transfer servers will be decommissioned on Jan. 9; use new endpoints

By | Feature, General Interest, News

To provide faster transfers of data from ARC services, ARC will be decommissioning the Flux-Xfer servers on January 9, 2023. You will need to update how you migrate your data. 

For everyone who uses ARC storage services, especially Data Den users: This message is VERY important! This change includes the use of scp from Flux-Xfer, as well as the Globus endpoint umich#flux. Any shared endpoints that you have created on umich#flux will be automatically migrated to a new Globus collection on January 9. Those who use Data Den should take special interest in Item 1 listed below. 

Action item – Use the new endpoints

  1. If you currently use Globus and umich#flux to access your Locker or Data Den volume, you should use the Globus Collection ‘UMich ARC Locker Non-Sensitive Volume Collection’ for Locker, and the Globus Collection ‘UMich ARC Data Den Non-Sensitive Volume Collection’ for Data Den.
  2. If you currently use Globus and umich#flux to access your Turbo volumes, you should use the Globus Collection ‘UMich ARC Turbo Non-Sensitive Volume Collection’.
  3. If you currently use Globus and umich#flux to access other storage volumes, you should use the Globus Collection ‘umich#greatlakes’.
  4. If you currently use scp on flux-xfer to copy data to/from Turbo, you should use ‘globus-xfer1.arc-ts.umich.edu’.
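The four action items above amount to a lookup from the storage service you use to its new endpoint. The sketch below records that lookup and builds, without running, an scp command for the new transfer host. The collection names and host name come from the list above; the function names, the uniqname, and the file paths are hypothetical.

```python
# Old umich#flux usage -> new Globus collection, as given in the action items.
NEW_GLOBUS_COLLECTIONS = {
    "locker": "UMich ARC Locker Non-Sensitive Volume Collection",
    "data den": "UMich ARC Data Den Non-Sensitive Volume Collection",
    "turbo": "UMich ARC Turbo Non-Sensitive Volume Collection",
}
# Item 3: any other storage volume moves to this collection.
DEFAULT_COLLECTION = "umich#greatlakes"
# Item 4: host that replaces flux-xfer for scp transfers to and from Turbo.
SCP_HOST = "globus-xfer1.arc-ts.umich.edu"


def new_collection(storage: str) -> str:
    """Return the Globus collection that replaces umich#flux for a storage service."""
    return NEW_GLOBUS_COLLECTIONS.get(storage.strip().lower(), DEFAULT_COLLECTION)


def scp_command(user: str, local_path: str, remote_path: str) -> list[str]:
    """Build (but do not run) an scp argument list targeting the new transfer host."""
    return ["scp", local_path, f"{user}@{SCP_HOST}:{remote_path}"]


if __name__ == "__main__":
    print(new_collection("Data Den"))
    # Hypothetical uniqname and paths, for illustration only.
    print(" ".join(scp_command("uniqname", "results.tar", "/path/on/turbo/")))
```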

How can we help you?

For assistance or questions, please contact ARC at arc-support@umich.edu.

Preserving Michigan’s musical history and culture

By | Feature, News, Research

From Kentucky bluegrass to Louisiana Zydeco to German hurdy-gurdy to East European Klezmer to Indian Manipuri dancing to Native American pow wows, and much more, musical traditions from around the country and around the world have found their way to Michigan. Since 2014, the Michigan Musical Heritage Project has been documenting Michigan’s folk music history.

Lester Monts

Lester Monts specializes in ethnomusicology and has been documenting Michigan’s folk cultural heritage since 2014. (Image courtesy Lester Monts)

The project is led by ethnomusicologist Dr. Lester P. Monts, Arthur F. Thurnau Professor Emeritus of Music, who began his musical journey as an orchestral trumpet player. He earned bachelor’s and master’s degrees in trumpet performance and taught trumpet at the college level before completing a doctoral degree in ethnomusicology and embarking on a research career. In the mid-1970s, Monts began to focus his research on music and culture in Liberia and Sierra Leone in West Africa. The fourteen-year Liberian civil war thwarted his fieldwork in that region.

The Michigan Musical Heritage Project was launched because there had been no systematic effort to collect and archive Michigan’s rich folk music heritage. Monts has embraced the study of music from the cultural and social aspects of the people who make it. He notes that “music brings people together; it has the power to create community, and we witnessed this occurring throughout our many journeys around the state.”

Using his charm, passion, likeability, and keen musical knowledge to cultivate trust with his interviewees, Monts captured more than 400 hours of audio and video data over the years, amassing a total of 80 terabytes of data. He believes this to be the most extensive collection of Michigan folk music in the state and that U-M is the right place to house this collection.

The Michigan Musical Heritage Project crew.

The Michigan Musical Heritage Project crew wraps up at the end of a recording session. (Image courtesy Lester Monts)

With a videography crew consisting primarily of former U-M students, Monts traveled all around the state to record performances at folk music festivals and cultural gatherings, such as the Celtic Festival (Saline), Irish Folk Music Festival (Muskegon), Hispanic Heritage Festival (Hart), Hiawatha Traditional Music Festival (Marquette), Port Sanilac Blues Festival (Port Sanilac), Africa World Festival (Detroit), Aura Jamboree (Aura), and Oldtime Fiddlers Convention and Traditional Music Festival (Hillsdale).

He says, “The creative talents of the state’s outstanding musicians must be preserved, not only for my research but for that of others as well. If properly preserved, I’m confident that in the future, the ethnomusicology program and the American Cultures department will find these data provide important insights into Michigan’s diverse musical heritage.”

How technology supports this project 

Monts’ crew includes a strong partnership with Tom Bray, converging technologies consultant and adjunct assistant professor of Art and Design, Penny W. Stamps School of Art and Design. Bray has been instrumental in pairing the right technology for the long-term preservation of this collection, which includes converting older footage to digital media. 

Tom Bray

Tom Bray (image courtesy LSA)

Bray has collaborated with Monts to convert older formats, such as VHS, 8mm, and Hi8 video, to digital files. The resulting files, some compressed and some uncompressed, are very large and high-resolution.

All of this wonderful and important audio and video footage needs to be preserved somewhere. But where do you turn when you have 80 terabytes of data? Monts said, “I’ve been desperately searching for a way to archive the video data collected under the auspices of the Michigan Musical Heritage Project.” 

Enter the U-M Research Computing Package (UMRCP) and the team from Advanced Research Computing (ARC), a division of Information and Technology Services. The UMRCP offers researchers across all campuses several resources at no additional cost, including 100 terabytes of long-term storage.

Bray said, “I had to read the UMRCP email announcement twice because I couldn’t believe my eyes. I was so excited that ITS and the university are supporting researchers in this way. We jumped on this opportunity right away.” 

ARC Director Brock Palen is excited about this work, too. “This is super interesting, and not like the usual types of research ARC normally sees, like climate and genomics. We’re happy to help Dr. Monts and Mr. Bray, and anyone who needs it, anytime. The archive is intentionally built for holding large-volume, raw data such as 4k video, and we are proud to be their go-to for this important cultural preservation project.” 

Old media in Dr. Monts' office

Hours and hours of media are being converted to a digital format. (Photo by Stephanie Dascola)

ARC replicates and encrypts data in two secure locations that are miles apart, so those who use ARC services do not have to worry about the crashes they might experience on their own equipment. The UMRCP also includes technical expertise from talented ARC staff to further remove barriers so researchers can do what they do best.

Monts and Bray also leverage the university’s network and WiFi services to transfer the files from their studio in the Duderstadt Center to storage. The network is designed to minimize bottlenecks so that data transfers quickly and efficiently. 

Dr. Monts said, “Although the pandemic temporarily disrupted my plans to complete the video documentary, I take solace in knowing that the many hours of data we collected is in a much safer environment than we had. The UMRCP storage resource is truly a boon!”

An old reel-to-reel tape player.

A reel-to-reel tape player. (Photo by Stephanie Dascola)

Lester Monts plays footage from a special women's only dance in Iberia.

Dr. Monts shows footage from a special women-only dance in Iberia. He earned permission to record this rarely documented group of women. (Photo by Stephanie Dascola)

No-cost research computing allocations now available

By | HPC, News, Research, Systems and Services, Uncategorized

U-M Research Computing Package

Researchers on all university campuses can now sign up for the U-M Research Computing Package, a new package of no-cost supercomputing resources provided by Information and Technology Services.

As of Sept. 1, university researchers have access to a base allocation for 80,000 CPU hours of high-performance computing and research storage services at no cost. This includes 10 terabytes of high-speed and 100 terabytes of archival storage.
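For a sense of scale, the sketch below estimates how many jobs of a given size fit within the 80,000 CPU-hour base allocation. It assumes usage is counted as cores multiplied by wall-clock hours; actual charging rules may also weigh memory or GPUs, and the 36-core, 24-hour job is a made-up example.

```python
BASE_CPU_HOURS = 80_000  # base allocation stated in the announcement


def cpu_hours(cores: int, wall_hours: float) -> float:
    """CPU hours used by one job, assuming usage = cores x wall-clock hours."""
    return cores * wall_hours


def jobs_within_allocation(cores: int, wall_hours: float,
                           allocation: float = BASE_CPU_HOURS) -> int:
    """Number of whole jobs of this size that fit in the allocation."""
    return int(allocation // cpu_hours(cores, wall_hours))


if __name__ == "__main__":
    # Hypothetical job: 36 cores for 24 hours.
    print(cpu_hours(36, 24))
    print(jobs_within_allocation(36, 24))
```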

These base allocations will meet the needs of approximately 75 percent of current high-performance-computing users and 90 percent of current research storage users. Researchers must sign up on ITS’s Advanced Research Computing website to receive the allocation.

“With support from President (Mark) Schlissel and executive leadership, this initiative provides a unified set of resources, both on campus and in the cloud, that meet the needs of the rich diversity of disciplines. Our goal is to encourage the use, support and availability of high-performance computing resources for the entire research community,” said Ravi Pendse, vice president for information technology and chief information officer.

The computing package was developed to meet needs across a diversity of disciplines and to provide options for long-term data management, sharing and protecting sensitive data, and more competitive cost structures that give faculty and research teams more flexibility to procure resources on short notice.

“It is incredibly important that we provide our research community with the tools necessary so they can use their experience and expertise to solve problems and drive innovation,” said Rebecca Cunningham, vice president for research and the William G. Barsan Collegiate Professor of Emergency Medicine. “The no-cost supercomputing resources provided by ITS and Vice President Pendse will greatly benefit our university community and the countless individuals who are positively impacted by their research.”

Ph.D. students may qualify for their own UMRCP resources depending on who is overseeing their research and their adviser relationship. Students should consult with their Ph.D. program administrator to determine their eligibility. ITS will confirm this status when a UMRCP request is submitted.

Undergraduate and master’s students do not currently qualify for their own UMRCP, but they can be added as users or administrators of another person’s UMRCP. Students can also access other ITS programs such as Great Lakes for Course Accounts, and Student Teams.

“If you’re a researcher at Michigan, these resources are available to you without financial impact. We’re going to make sure you have what you need to do your research. We’re investing in you as a researcher because you are what makes Michigan Research successful,” said Brock Palen, Advanced Research Computing director.

Services that are needed beyond the base allocation provided by the UMRCP are available at reduced rates and are automatically available for all researchers on the Ann Arbor, Dearborn, Flint and Michigan Medicine campuses.

HPC, storage now more accessible for researchers

By | HPC, News, Systems and Services

U-M Research Computing Package decorative image

Information and Technology Services (ITS) has launched a new package of supercomputing resources for researchers and Ph.D. students on all U-M campuses: the U-M Research Computing Package.

The U-M Research Computing Package will reduce the current rates for high-performance computing and research storage services provided by ITS by an estimated 35-40 percent, effective July 1.

In addition, beginning Sept. 1, university researchers will have access to a base allocation for high-performance computing and research storage services (including high-speed and archival storage) at no cost, thanks to an additional investment from ITS. These base allocations will meet the needs of approximately 75 percent of current high-performance computing users and 90 percent of current research storage users.

Learn more about the U-M Research Computing Package

 

Using tweets to understand climate change sentiment

By | HPC, News, Research, Systems and Services

A team from the Urban Sustainability Research Group of the School for Environment and Sustainability (UM-SEAS) has been studying public tweets to understand climate change and global warming attitudes in the U.S.

Dimitris Gounaridis is a fellow with the study. The team is mentored by Joshua Newell and combines work on perceptions of climate change by Jianxun Yang with a proprietary level vulnerability assessment by Wanja Waweru.

“This research is timely and urgent. It helps us identify hazards, and elevated risks of flooding and heat, for socially vulnerable communities across the U.S. This risk is exacerbated especially for populations that do not believe climate change is happening,” Dimitris stated. 

The research team used a deep learning algorithm that reads the text of a tweet and predicts whether the person tweeting believes in climate change. The algorithm analyzed a total of 7 million public tweets drawn from two sources: the U-M Twitter Decahose and the George Washington University Libraries Dataverse. The Decahose dataset consists of a historical archive of Decahose tweets and an ongoing collection from the Decahose. The current deep learning model has an 85% accuracy rate and is validated at multiple levels.
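The article does not say how the accuracy rate was measured. One common reading, assumed here, is the fraction of tweets whose predicted label matches a human-assigned label in a held-out set. The sketch below shows that calculation on made-up labels; the label names and the function are illustrative and are not taken from the team's model.

```python
def accuracy(predicted: list[str], actual: list[str]) -> float:
    """Fraction of predictions that match the reference labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    predicted = ["believer", "skeptic", "believer", "believer"]
    actual = ["believer", "skeptic", "skeptic", "believer"]
    print(accuracy(predicted, actual))
```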

The map below shows the prediction of specific users that believe or are skeptical of climate change and global warming. Dimitris used geospatial modeling techniques to identify clusters of American skepticism and belief to create the map.

A map of the United States with blue and red dots indicating climate change acceptance.

(Image courtesy Dimitris Gounaridis.)

The tweet stream is sampled in real-time. Armand Burks, a research data scientist with ARC, wrote the Python code that is responsible for continuously collecting the data and storing it in Turbo Research Storage. He says that many researchers across the university are using this data for various research projects as well as classes. 

“We are seeing an increased demand for shared community data sets like the Decahose. ARC’s platforms like Turbo, ThunderX, and Great Lakes, hold and process that data, and our data scientists are available, in partnership with CSCAR, to assist in deriving meaning from such large data. 

“This is proving to be an effective way to combine compute services, methodology, and campus research mission leaders to make an impact quickly,” said Brock Palen, director of ARC.

In the future, Dimitris plans to refine the model to increase its accuracy, and then combine that with climate change vulnerability for flooding and heat stress.

“MIDAS is pleased that so many U-M faculty members are interested in using the Twitter Decahose. We currently have over 40 projects with faculty in the Schools of Information, Kinesiology, Social Work, and Public Health, as well as at Michigan Ross, the Ford School, LSA and more,” said H.V. Jagadish, MIDAS director and professor of Electrical Engineering and Computer Science.

The Twitter Decahose is co-managed and supported by MIDAS, CSCAR, and ARC, and is available to all researchers without any additional charge. For questions about the Decahose, email Kristin Burgard, MIDAS outreach and partnership manager.

ARC-TS seeks pilot users for two new research storage services

By | General Interest, Happenings, HPC, News

Advanced Research Computing – Technology Services (ARC-TS) is seeking pilot users for two new research storage services.

The first, Locker, is group project storage focused on large data sets, and is available at a cost less than half that of current primary storage services. Locker still provides encryption, replication, snapshots, and workstation access. Example use cases for Locker are research projects in climate studies, genomics, imaging, and other data-intensive sciences.

The second service, Data Den, provides archive-class storage for research data that is not actively used. As our lowest-cost research storage offering, Data Den provides “cold storage” for massive amounts of data with 20 petabytes of encrypted and replicated capacity. Data Den allows researchers to preserve data between rounds of funding and management plans, and to free up space in more expensive primary storage by moving valuable, but not currently used, data.

Those interested in participating in the pilots should contact ARC-TS at hpc-support@umich.edu.

Turbo High Performance Research Storage grows 2PB and increases speed

By | General Interest, Happenings, HPC, News

Turbo Research Storage, the high-performance research storage option available to researchers anywhere on campus, was recently expanded with 2 PB of new encrypted capacity. This new capacity allows Turbo to keep up with the growth of research data while also increasing performance with expanded caches and more network connectivity.

The work also increased Turbo’s performance to campus and ARC-TS resources by 50 percent, to 60 Gbps. A plan was also approved that allows Turbo to grow to 160 Gbps, with room to reach 320 Gbps, between Turbo and the newly announced HPC system, Great Lakes.
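The figures above can be checked with a little arithmetic. The sketch below derives the rate before the upgrade from the stated 50 percent increase and estimates the ideal time to move a dataset at a given link speed. It assumes decimal units (1 TB = 10^12 bytes) and a transfer at full line rate with no protocol or disk overhead, so real transfers will be slower; the function names and the 1 TB dataset are illustrative.

```python
def previous_rate(new_rate_gbps: float, percent_increase: float) -> float:
    """Rate before an increase of `percent_increase` percent produced `new_rate_gbps`."""
    return new_rate_gbps / (1 + percent_increase / 100)


def transfer_seconds(size_terabytes: float, link_gbps: float) -> float:
    """Ideal time to move data at full line rate (decimal units, no overhead)."""
    bits = size_terabytes * 1e12 * 8
    return bits / (link_gbps * 1e9)


if __name__ == "__main__":
    print(previous_rate(60, 50))  # rate implied before the 50 percent increase
    # Hypothetical 1 TB dataset at the three rates mentioned in the announcement.
    for gbps in (60, 160, 320):
        print(gbps, round(transfer_seconds(1, gbps), 1))
```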