Patches being deployed for Meltdown and Spectre attacks

By | Flux, General Interest, Happenings, News

On January 3, two major vulnerabilities in computer chips made by Intel, AMD, ARM, and others were made public. Collectively, the two issues are being referred to as Meltdown and Spectre, and they could allow low-privilege processes to access kernel memory allocated to other running programs. Almost every major operating system vendor has released patches for Meltdown, and we are in the process of deploying them on our major systems. Deploying these patches will have varying performance impacts depending on your workload. Given the high-profile nature of the Meltdown hardware vulnerability and the existence of exploits in the wild, we have no choice but to deploy patches on all systems. Our mitigation strategies for Meltdown on each system are listed below. The existing patches also fix some known Spectre exploits, but further patches may follow as discovery continues.
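
If you are curious whether a CentOS or RHEL machine is running a Meltdown-patched kernel, one common check (a general sketch, not an ARC-TS-specific procedure) is to search the installed kernel package's changelog for the Meltdown CVE identifier, CVE-2017-5754:

    # Show the currently running kernel version
    uname -r

    # Red Hat and CentOS record security fixes in the kernel RPM changelog;
    # a match for the Meltdown CVE indicates the installed kernel is patched
    rpm -q --changelog kernel | grep -i 'CVE-2017-5754'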

For further information regarding Meltdown and Spectre, see the researchers’ official site: https://meltdownattack.com.

ARC-TS Systems:

Flux/Armis: CentOS has released packages to mitigate Meltdown. These packages are being installed during the winter maintenance.

ConFlux: IBM’s PowerPC architecture is not known to be affected by Meltdown at this time. The impact of Spectre is still being evaluated.

Flux Hadoop: CentOS has released packages to mitigate Meltdown. These packages are being installed during the winter maintenance.

YBRC: We do not anticipate any outages and will use our standard upgrade procedure. ARC-TS is working closely with Yottabyte and upstream sources to get a patch ready to mitigate Meltdown’s impact on the platform. The only user-facing impact should be a degradation of storage performance and a brief suspension of networking when a VM migrates hosts (usually a single dropped ping). The timeframe for applying the patch is still unknown, but we intend to patch the hosts as soon as patches become available. ARC-TS will send follow-up notifications before starting the patch process. Applying patches to the various VMs/hosts/containers will require a restart of each affected machine.

Turbo: Meltdown’s impact on Turbo is low, and we have not yet received guidance on Dell/EMC/Isilon patch procedures.

Potential service disruption for Value Storage maintenance — Dec. 2

By | Flux, General Interest, Happenings, HPC, News

The ITS Storage team will be applying an operating system patch to the MiStorage Silver environment, which provides home directories for both Flux and Flux Hadoop. The ITS maintenance window will run from 11:00pm on December 2nd to 7:00am on December 3rd (8 hours total). This update may disrupt the stability of the nodes and the jobs running on them.

The ITS status page for this incident is here:  http://status.its.umich.edu/report.php?id=141155

For Flux users: we have created a reservation on Flux so that no jobs will be running or impacted during the window. We will remove the reservation once the ITS storage team confirms a successful update.

For Flux Hadoop users: the scheduler and user logins will be deactivated when the outage starts, and any user logged into the cluster will be logged out for the duration of the outage. We will reactivate access once the ITS storage team gives the all-clear.

Status updates will be posted on the ARC-TS Twitter feed: https://twitter.com/arcts_um. If you have any questions, please email us at hpc-support@umich.edu.

CSCAR provides walk-in support for new Flux users

By | Data, Educational, Flux, General Interest, HPC, News

CSCAR now provides walk-in support during business hours for students, faculty, and staff seeking assistance in getting started with the Flux computing environment. CSCAR consultants can walk a researcher through the steps of applying for a Flux account, installing and configuring a terminal client, connecting to Flux, using basic SSH and Unix command-line tools, and obtaining or accessing allocations.
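
As a minimal illustration of the connection step, logging in to Flux from a terminal looks like the following sketch (the login host name is illustrative; confirm the current address with ARC-TS or a CSCAR consultant):

    # Connect to a Flux login node, replacing "uniqname" with your own
    # (host name shown is illustrative; verify it before use)
    ssh uniqname@flux-login.arc-ts.umich.edu

    # A couple of basic commands to get oriented after logging in
    pwd      # print the current working (home) directory
    ls -l    # list files in long format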

In addition to walk-in support, CSCAR has several staff consultants with expertise in advanced and high performance computing who can work with clients on a variety of topics such as installing, optimizing, and profiling code.  

Support is also available by email at hpc-support@umich.edu.

CSCAR is located in room 3550 of the Rackham Building (915 E. Washington St.). Walk-in hours are from 9 a.m. – 5 p.m., Monday through Friday, except for noon – 1 p.m. on Tuesdays.

See the CSCAR web site (cscar.research.umich.edu) for more information.

ARC-TS seeks input on next generation HPC cluster

By | Events, Flux, General Interest, Happenings, HPC, News

The University of Michigan is beginning the process of building its next-generation HPC platform, “Big House.” Flux, the shared HPC cluster, has reached the end of its useful life. It has served us well for more than five years, but as we move forward with a replacement, we want to make sure we are meeting the needs of the research community.

ARC-TS will hold a series of town halls to gather input from faculty and researchers on the next HPC platform to be built by the University. These town halls are open to anyone and will be held at:

  • College of Engineering, Johnson Room, Tuesday, June 20th, 9:00a – 10:00a
  • NCRC Bldg 300, Room 376, Wednesday, June 21st, 11:00a – 12:00p
  • LSA #2001, Tuesday, June 27th, 10:00a – 11:00a
  • 3114 Med Sci I, Wednesday, June 28th, 2:00p – 3:00p

Your input will help ensure that U-M stays on course in providing HPC resources that meet researchers’ needs, so we hope you will make time to attend one of these sessions. If you cannot attend, please email hpc-support@umich.edu with any input you would like to share.

HPC maintenance scheduled for January 7 – 9

By | Flux, General Interest, News

To accommodate upgrades to software and operating systems, Flux, Armis, and their storage systems (/home and /scratch) will be unavailable starting at 9am on Saturday, January 7th, returning to service on Monday, January 9th. Additionally, external Turbo mounts will be unavailable from 11pm on Saturday, January 7th, until 7am on Sunday, January 8th.

During this time, the following updates are planned:

  • Operating system and software updates (minor updates) on Flux and Armis.  This should not require any changes to user software or processes.
  • Resource manager and job scheduling software updates.
  • Operating system updates on Turbo.

For HPC jobs, you can use the command "maxwalltime" to see how much walltime remains before the maintenance begins. Jobs that cannot complete before the maintenance begins will start once the clusters return to service.
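
For example (a sketch: "maxwalltime" is the ARC-TS command named above, and the walltime request uses standard Torque/PBS syntax):

    # Report the maximum walltime available before the maintenance begins
    maxwalltime

    # Submit a job whose walltime fits inside that window
    # ("myjob.pbs" is a placeholder for your own job script);
    # jobs requesting more time than remains will start after
    # the clusters return to service
    qsub -l walltime=24:00:00 myjob.pbs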

We will post status updates on our Twitter feed (https://twitter.com/arcts_um) and send an email to all HPC users when the outage has been completed.

U-M team uses Flux HPC cluster for pre-surgery simulations

By | Flux, General Interest, News

Last summer, Alberto Figueroa’s BME lab at the University of Michigan achieved an important “first” – using computer-generated blood flow simulations to plan a complex cardiovascular procedure.

“I believe this is the first time that virtual surgical planning was done for real and not as a retrospective theoretical exercise,” says Figueroa.

Using a patient’s medical and imaging data, Figueroa was able to create a model of her unique vasculature and blood flow, then use it to guide U-M pediatric cardiologists Aimee Armstrong, Martin Bocks, and Adam Dorfman in placing a graft in her inferior vena cava to help alleviate complications from pulmonary arteriovenous malformations (PAVMs). The simulations were done using the Flux HPC cluster.

Read more…

Cluster upgrades completed

By | Flux, General Interest, News

Several key systems were updated and improved during the ARC-TS HPC summer maintenance from July 16 – 23, 2016.

Among other improvements, the updates provide access to more current versions of popular software and libraries, allow new features and more consistent runtimes for job scheduling, and migrate two-factor authentication for the login servers to a new system.

The improvements included:

  • Upgrades to the operating system and supporting software for the cluster. This was a major update from the previously installed Red Hat version (RHEL 6.6) to CentOS 7.1. It provides newer versions of commonly used software and libraries, and will help us deliver more user-facing features in the coming months.
  • Cluster management software updates and reconfiguration. This includes Torque 6, which has a new set of resource options. The new Torque version provides better language for defining tasks, more consistent runtimes, and a platform for new features.
  • Upgrade of the Flux Hadoop environment to Cloudera 5.7, which now includes Hive on Spark (the Hadoop cluster will return to service later this week).
  • Updates to /scratch on Flux.
  • Transition of the software modules environment to a system called Lmod; basic usage is shown below. For more information, see our Lmod transition page. The complete Lmod User Guide can be found here: https://www.tacc.utexas.edu/research-development/tacc-projects/lmod/user-guide.
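
For most users, day-to-day module commands work much as before. A few common Lmod commands (the module names are illustrative):

    module avail           # list modules that can be loaded right now
    module spider matlab   # search the entire module tree (an Lmod feature)
    module load gcc        # load a module into your environment
    module list            # show the modules currently loaded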

An HPC 2016 Upgrade Frequently Asked Questions page is available documenting a variety of issues related to the upgrades. Please direct any questions to hpc-support@umich.edu.

New ARC Connect service provides desktop graphical interface for HPC resources

By | Educational, Flux, General Interest, News

Users of ARC-TS computing resources can now use desktop versions of popular software packages like Matlab and R while accessing the Flux shared computing cluster. The new service, called ARC Connect, provides an easily accessible graphical user interface that simplifies doing interactive, graphical work backed by the high performance and large memory capabilities of the Flux cluster.

Using ARC Connect may benefit you if you need to:

  • Easily use graphical software interactively on HPC clusters (Flux, Armis).
  • Do high performance, interactive visualizations.
  • Share and collaborate with colleagues on HPC-driven research.
  • Use HPC in teaching.

Features:

  • Remote desktop sessions (VNC) for using Flux graphically and interactively.
  • Jupyter notebooks for Python and R (other languages coming soon).
  • RStudio interactive development environment for R.

Users can run desktop applications such as MATLAB or RStudio as if they were running on a laptop, but with all the power of Flux, rather than using them in batch mode or via text-only interactive sessions. Users can also run notebooks that require more processing power or memory than their local computer or tablet provides (currently, Python and R notebooks are available).

ARC Connect is an enhanced version of the TACC/XSEDE Visualization Portal, and has been made possible at the University of Michigan through a collaboration between ARC Technical Services and the Texas Advanced Computing Center at the University of Texas.

For information on how to use ARC Connect, visit arc-ts.umich.edu/arc-connect. If you need further help, contact hpc-support@umich.edu.