Logging In


To log in to the Cavium Hadoop cluster, you need a terminal.  Currently the cluster is only accessible via the command line.

If you are trying to log in from off campus, or using the MGuest wireless network, you have a couple of options:

    • Install VPN software on your computer
    • First ssh to login.itd.umich.edu, then ssh to cavium-thunderx.arc-ts.umich.edu from there.
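
The second option can be sketched as a single command that chains the two hops. This is a minimal sketch, not an official procedure: the UNIQNAME variable and the two_hop_cmd helper are hypothetical names introduced here, and it assumes your uniqname is the same on both hosts. The script only prints the command so you can review it before running it yourself.

```shell
#!/bin/sh
# Assumption: "uniqname" is a placeholder; set UNIQNAME to your own uniqname.
UNIQNAME="${UNIQNAME:-uniqname}"
GATEWAY="login.itd.umich.edu"
CLUSTER="cavium-thunderx.arc-ts.umich.edu"

# Hypothetical helper: builds the chained login command.
# -t asks ssh to allocate a terminal on the gateway so that the
# second ssh (run there as a remote command) stays interactive.
two_hop_cmd() {
    printf 'ssh -t -l %s %s ssh -l %s %s\n' \
        "$UNIQNAME" "$GATEWAY" "$UNIQNAME" "$CLUSTER"
}

# Print the command for review; copy and paste it to log in.
two_hop_cmd
```

Running the two ssh commands by hand, one after the other, has the same effect; expect to authenticate separately on each host.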

Here’s what a login looks like using a terminal emulator:

Mac, using Terminal: open the Terminal application.

Type: ssh -l uniqname cavium-thunderx.arc-ts.umich.edu (replacing uniqname with your own uniqname)

Windows, using PuTTY (http://www.chiark.greenend.org.uk/~sgtatham/putty/):

Launch PuTTY, enter cavium-thunderx.arc-ts.umich.edu as the host name, then click Open.

For both Mac and Windows:

At the “Enter a passcode or select one of the following options:” prompt, type the number of your preferred choice for Duo authentication.

Creating an Account


Using the Cavium Hadoop cluster requires an ARC user login. Please fill out this form to request a login if you do not already have one.
When you create an account, you will automatically be added to our default queue. If you are using our Hadoop cluster for a class or another specific purpose, or you need a significant allocation, please open a ticket with us so you can be added to the appropriate queue. Many of the examples in this user guide include a “--queue <your_queue>” flag. Replace <your_queue> with the name of your queue when running these examples, or use “default” if you have not been assigned one.
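
As an illustration of filling in the queue flag (written with two ASCII hyphens), the sketch below falls back to "default" when no queue has been set. The QUEUE variable and the queue_flag helper are hypothetical names, not part of the cluster's tooling. The commented spark-submit line assumes Spark on YARN is available on the cluster and uses a made-up job file, my_job.py.

```shell
#!/bin/sh
# Assumption: QUEUE holds the queue you were assigned; if it is unset
# or empty, fall back to "default".
QUEUE="${QUEUE:-default}"

# Hypothetical helper: prints the queue flag to pass to a job command.
queue_flag() {
    printf '%s %s\n' '--queue' "$QUEUE"
}

# Show the flag that would be used.
queue_flag

# Example use (assumes Spark on YARN; my_job.py is a hypothetical file):
#   spark-submit --master yarn $(queue_flag) my_job.py
```

Setting QUEUE once, for example in your shell startup file, avoids retyping the queue name in every example.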

Overview


The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across a cluster of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures. (From hadoop.apache.org)

The software available is:

Please see our workshop training material: