Text as Data in Python and R
March 22 @ 10:00 am - 12:00 pm
Machine learning is opening new research opportunities across academic disciplines by making new types of data, such as text, available for analysis. In this workshop, we will analyze a text corpus to demonstrate the use of machine learning for natural language processing.
In the first half of the workshop, we will provide a basic overview of machine learning, introduce the main concepts and logic of using text as data, and walk through a typical workflow for processing, managing, and analyzing a text corpus. We will discuss how to choose between two popular languages for text analysis, Python and R, and how to interpret the results from a topic model.
Participants who are interested in implementing text analysis methods in their own computational work will be invited to attend the second half of the workshop. There, instructors will demonstrate in two concurrent hands-on tutorials how the topic modeling example from the first half was accomplished in either Python or R. Basic familiarity with the chosen programming language is expected for this half.
Participants who attend only the first half will walk away with a basic overview of the capabilities and methods for using text as data. Participants who attend the entire workshop will be equipped with basic programming tools to apply natural language processing in their own research. The workshop will also cover helpful resources for machine learning implementations at the University of Michigan, such as data sets, storage space, high performance computing, and consultation services.
Machine Learning Specialist
Information and Technology Services – Advanced Research Computing
Meghan Dailey is a machine learning specialist in the Advanced Research Computing (ARC) department at the University of Michigan. She consults on several faculty and student machine learning applications and research studies, specializing in natural language processing and convolutional neural networks. Before her position at the university, Ms. Dailey worked for a defense contractor as a software engineer to design and implement software solutions for DoD-funded artificial intelligence efforts.
Center for Political Studies, Institute for Social Research and Information and Technology Services – Advanced Research Computing
Jule Krüger, Ph.D., is the ISR Program Manager for Big Data and Data Science, based within the Center for Political Studies at the Institute for Social Research, and a member of the Advanced Research Computing Consulting Services. She has more than 10 years of experience in processing, analyzing and interpreting data for social science research, and automating workflows for scalable, auditable and reproducible analysis.
Registration is required. Participants will receive a Zoom link the day before the workshop.
An instructor will be available at the same Zoom link from 9:00 to 10:00 AM, before the workshop begins, for computer setup assistance.
Please note, this session will be recorded.
To register and view more details, please refer to the linked TTC page.
If you have questions about this workshop, please send an email to the instructor at email@example.com