XSEDE Big Data training available at U-M — Sept. 2

September 2, 2014 @ 12:00 am

The Extreme Science and Engineering Discovery Environment (XSEDE) conducts a series of hands-on workshops that give researchers a convenient way to learn about the latest techniques and technologies of current interest in HPC. The University of Michigan will host an upcoming session on Big Data:

September 2, 2014
11 a.m. – 5 p.m. EST
3100 North Quad

11 a.m. – 1 p.m. Big Data Programming with Hadoop and Spark

This session will give an overview of programming big data applications, focusing on Hadoop and Spark.

I. Hadoop System Overview
This section will cover the basics of the Hadoop environment. We will discuss the MapReduce daemons, the scheduling and monitoring environment, and interacting with the distributed file system (HDFS).

II. Hadoop Jobs
We will write a simple Java Map/Reduce program and run through the process of compiling, packaging, submitting, monitoring, and collecting the output of a Hadoop job. We will also briefly discuss other applications that run on the Hadoop platform, such as HBase and Hadoop Streaming.

III. Spark
We will discuss the Spark platform and its concept of Resilient Distributed Datasets. We will cover the relationship between Spark and Hadoop, and we will write and submit an example job. We will also discuss the Spark Machine Learning API.

2 – 5 p.m. Urika Training
  • Learn the graph analytic approach to data analysis, including some real-world examples.
  • Gain an introduction to the RDF data format and the SPARQL query language, with hands-on practice.
  • Learn how to interact with the Sherlock Urika system.
The Ehrlicher Room (Room 3100) is on the 3rd floor of North Quad. To find it, follow Path #2 in the directions, entering from the Thayer/Washington Plaza.

To register for the workshop, visit the XSEDE registration page. Seating is limited.


