Just as fluency in statistics is now considered a fundamental skill in many academic fields, organizers of the interdisciplinary Big Data Summer Camp think that the ability to extract and analyze data from the Web will soon be indispensable to researchers across the social sciences.
Graduate students from several different disciplines spent a week in May in a classroom in the Ross School of Business learning some of the programming and database tools necessary to do research using this source of big data.
“Five years from now, every social science discipline is going to have these tools built into their training,” said Gerald Davis, Professor of Management & Organizations and director of the Interdisciplinary Committee on Organizational Studies (ICOS). “You will need to know this stuff because the data is out there. We think there will be required big data classes, and we’re piloting what that might look like.”
This is the second year for the camp, which is sponsored by ICOS, Advanced Research Computing at U-M (ARC), and LSA IT Advocacy and Research Support. Davis said attendance rose to 60 from about 35 last year, and there was also a waiting list of about 15. Participants were graduate students from economics, public health, education, information, business, sociology, psychology, urban planning, and political science.
“We want to create communities among students that cross boundaries of disciplines and schools,” Davis said. “This is a concrete way to do that.”
Students were split up into interdisciplinary teams and given the assignment to find “one plausibly true thing” from the oceans of data on the internet. The tools and skills they learned included SQL databases, Python, and application programming interfaces (APIs).
Some of the resulting projects were examinations of the factors that influence online beer reviews; the differences between wealthy people in China and the U.S.; the characteristics and behavior of Democratic vs. Republican senators on Twitter; and how crime patterns in Ann Arbor are affected by U-M football games.
Denise Lillvis, a graduate student in Public Health and Political Science, said the skills she learned in the camp will help improve her research, which focuses on the relationship between the professionalism of state health department employees and the health outcomes of their clients.
“(Before the workshop) I didn’t know how to get an API to do what I wanted it to do,” she said. “Now I feel more confident in thinking about what I can do, and being hooked up with a hub of resources will definitely help.”
Davis said the workshop’s pedagogy has dramatically improved since last year. He credited the improvement to Matt Burton, a graduate student in the School of Information, who attended a big data “train-the-trainer” session put on by Mozilla this spring in Toronto. Burton was hired to facilitate the summer camp this year.
Burton said the training in Toronto essentially was about “how to teach programming to non-programmers,” and that there is growing demand for big data skills in the social sciences.
“I think there is a need in humanities as well, although it’s lagging a little behind,” he added. “I think it cuts across domains.”
“Over the long term, what’s going to happen is the same thing that happened as statistics moved into the disciplines,” he added. “We’re going to have programming courses taught in the disciplines, but right now there’s a massive shortage of teachers.”
The organizing committee consisted of Davis; Jason Owen-Smith, Professor of Organizational Studies and Sociology; Brian Noble, Associate Dean for Undergraduate Education and Professor of Electrical Engineering and Computer Science; and Cliff Lampe, Associate Professor of Information, School of Information.
Davis said the organizers hope to keep growing the program, and to develop a template for workshops or courses that can be taught independently. “We want to create a recipe for anyone to use,” he said.
______________________________________________
To connect with researchers using big data at U-M, consider joining the ICOS Big Data Users Group (IBUG), which was created as a continuation of the Big Data Summer Camp. To join, contact Matt Burton (firstname.lastname@example.org), Eric Seymour (email@example.com) or Russell Funk (firstname.lastname@example.org).
The group holds four to five meetings per semester, or roughly one a month, each lasting about 90 minutes. For each meeting, the group generally invites a speaker (sometimes a regular member of the group, sometimes a new person) to talk about his or her work and its connections to big data, programming, and related topics. The speakers span a wide range of experience levels. The group provides an opportunity for informal knowledge sharing and for discussing technical problems that members encounter in their research.