XSEDE: Big Data and Machine Learning


OVERVIEW

XSEDE, along with the Pittsburgh Supercomputing Center, is pleased to present a two-day Big Data and Machine Learning workshop.

This workshop will focus on topics such as Hadoop and Spark and will be presented using the Wide Area Classroom (WAC) training platform.

 

Please see this site for more information

 

Due to COVID-19, this workshop will be remote, using Zoom.

Register at https://portal.xsede.org/xup/course-calendar. If you do not currently have an XSEDE Portal account, you will need to create one:

https://portal.xsede.org/my-xsede?p_p_id=58&p_p_lifecycle=0&p_p_state=maximized&p_p_mode=view&_58_struts_action=%2Flogin%2Fcreate_account

Should you have any problems with that process, please contact help@xsede.org for assistance.

 

Introduction to Research Computing on the Great Lakes Cluster


OVERVIEW

This workshop will introduce you to high-performance computing on the Great Lakes cluster. After a brief overview of the components of the cluster and the resources available there, the main body of the workshop will cover creating batch scripts and the options available for running jobs, with hands-on experience in submitting and tracking jobs and interpreting their results. By the end of the workshop, every participant should have created a submission script, submitted a job, tracked its progress, and collected its output. Additional tools, including high-performance data transfer services and interactive use of the cluster, will also be covered.

To register and view more details, please refer to the linked TTC page.

Introduction to the Linux Command Line


OVERVIEW

This course will familiarize the student with the basics of accessing and interacting with Linux computers using the GNU/Linux operating system’s Bash shell, also generically referred to as “the command line”. Topics include a brief overview of Linux, the Bash shell, navigating the file system, basic commands, shell redirection, permissions, processes, and the command environment. The workshop will also provide a quick introduction to nano, a simple text editor that will be used in subsequent workshops to edit files.

 

To register and view more details, please refer to the linked TTC page.

Advanced Research Computing on the Great Lakes Cluster


OVERVIEW

This workshop will cover more advanced topics in computing on the U-M Great Lakes Cluster. Topics include a review of common parallel programming models and basic use of Great Lakes; dependent and array scheduling; workflow scripting using Bash; high-throughput computing using launcher; parallel processing in one or more of Python, R, and MATLAB; and profiling of parallel code using Allinea Performance Reports and Allinea MAP.

 

To register and view more details, please refer to the linked TTC page.

Data Sharing and Archiving


OVERVIEW

As data volumes grow, how we manage data becomes more important. This session will cover the basics of managing data in a research environment such as those at ARC and nationally. Attendees will be introduced to recommended tools for data sharing and transfer on campus, off campus, and in the cloud. They will learn how to prepare data for archiving, including high-performance versions of tar and compression tools that offer significant performance benefits over the standard versions.

Lastly, we will cover the properties and selection of appropriate general-purpose storage for data that requires long-term preservation, and active archiving that supports the largest data volumes while controlling costs and keeping management simple.

Prerequisite: basic familiarity with the command line.

To register and view more details, please refer to the linked TTC page.

 
