This workshop will provide an overview of how to scrape data from HTML pages and website APIs using Python. This will mostly be accomplished with the requests, BeautifulSoup, and retry modules, together with the browser's developer tools. The workshop is intended for users with basic Python knowledge. Anaconda Python 3.5 will be used.
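As a taste of the parsing step, here is a minimal sketch using only the standard library's html.parser module, so it runs without installing anything; the workshop itself uses requests to fetch pages and BeautifulSoup to parse them, which offers a far more convenient interface. The page content below is a made-up stand-in for a fetched response body.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag encountered."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

# A static string stands in for the response body that requests.get(url).text
# would return; no network access is needed to run the sketch.
page = '<html><body><a href="/a">first</a> <a href="/b">second</a></body></html>'
collector = LinkCollector()
collector.feed(page)
# collector.links now holds ["/a", "/b"]
```

With BeautifulSoup, the whole class above collapses to `soup.find_all("a")`, which is the main reason the workshop uses it.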
This workshop will introduce participants to Python's Pandas library. We'll start with a brief explanation of Anaconda and the Jupyter notebook environment (not required for participants, but the instructor will be using these tools). After a brief introduction to Python's main standard data types and to Pandas data structures, we'll demonstrate how to retrieve information from Pandas Series and DataFrames. We'll also demonstrate basic input/output, selection, dropping, sorting, ranking, grouping, and apply operations. Although not required, we recommend that all participants have a basic knowledge of Python.
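A short sketch of the kinds of operations listed above; the table, column names, and values are invented for illustration.

```python
import pandas as pd

# A small invented table standing in for workshop data.
df = pd.DataFrame({
    "name": ["Ada", "Ben", "Cal", "Dee"],
    "dept": ["math", "cs", "cs", "math"],
    "score": [90, 75, 85, 60],
})

# Selection: label-based with .loc, position-based with .iloc.
first_name = df.loc[0, "name"]
first_two = df.iloc[:2]

# Dropping a row by its index label, and sorting by a column.
without_row_1 = df.drop(1)
by_score = df.sort_values("score", ascending=False)

# Ranking, grouping, and an apply operation.
df["rank"] = df["score"].rank(ascending=False)
dept_mean = df.groupby("dept")["score"].mean()
centered = df["score"].apply(lambda s: s - df["score"].mean())
```

Each of these returns a new Series or DataFrame rather than modifying `df` in place (except the `df["rank"] = …` assignment), which is the Pandas idiom the workshop builds on.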
This workshop will provide a quick overview of natural language processing using Python. We'll cover the basics: segmenting text into tokens, assigning part-of-speech tags, assigning dependency labels, and detecting and labeling named entities. We'll also cover sentiment analysis, topic modelling, and possibly some visualizations. The workshop will be conducted in Python and is intended for users with basic Python programming knowledge. Anaconda Python 3.5 and a Jupyter Notebook will be used.
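As a toy illustration of the very first step (tokenization), the sketch below splits text into word and punctuation tokens using only the stdlib re module. The description doesn't name the workshop's library; full NLP toolkits handle tokenization (and everything after it) far more robustly than this one-liner.

```python
import re

def tokenize(text):
    """Very naive tokenizer: runs of word characters, or single
    punctuation marks, matched with a regular expression."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("Ann's cat sleeps, doesn't it?")
# Note how crudely this splits contractions like "doesn't" --
# exactly the kind of problem real tokenizers are built to solve.
```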
Modern computers have a CPU with multiple cores (usually between 4 and 8). Come learn how to take advantage of them to parallelize and speed up your code. We'll show you how to structure your code so you can parallelize it in five lines or fewer. We will also cover some theory and a few practical considerations, along with some basic exercises. We'll be using the multiprocessing module in Python. The workshop is intended for users with basic Python knowledge and assumes you know how to do the following in Python: i) write a for loop, ii) write a function that has inputs and outputs. Anaconda Python 3.5 will be used.
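The "five lines or fewer" structure usually looks like the multiprocessing.Pool sketch below; `square` and its inputs are placeholders for your own function and data.

```python
from multiprocessing import Pool

def square(n):
    # The per-item work; each call may run in a different worker process.
    return n * n

def parallel_squares(numbers, workers=4):
    # Pool.map splits the inputs across worker processes and returns
    # the results in the original input order.
    with Pool(workers) as pool:
        return pool.map(square, list(numbers))

if __name__ == "__main__":
    print(parallel_squares(range(8)))
```

The `if __name__ == "__main__"` guard matters: on platforms that spawn (rather than fork) worker processes, each worker re-imports the script, and the guard stops it from recursively creating new pools.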
This workshop will delve into common data processing and exploration techniques, using Pandas to perform data exploration in Python. Among other things, we'll demonstrate how to load data files, sort data, group variables, merge/join datasets, and create common plots. Although not required, we recommend that all participants have a basic knowledge of Python.
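A compact sketch of that load/sort/group/merge pipeline; the inline CSV and the second table are invented so the example is self-contained.

```python
import pandas as pd
from io import StringIO

# Loading a data file: an inline CSV string stands in for a file on disk.
csv_text = "id,city,temp\n1,Detroit,21\n2,Ann Arbor,19\n3,Detroit,25\n"
readings = pd.read_csv(StringIO(csv_text))

# Sorting and grouping.
warmest_first = readings.sort_values("temp", ascending=False)
city_mean = readings.groupby("city")["temp"].mean()

# Merging/joining with a second invented table on a shared key.
stations = pd.DataFrame({"id": [1, 2, 3], "station": ["A", "B", "A"]})
merged = readings.merge(stations, on="id")

# A common plot would be readings.plot(x="id", y="temp"), which needs
# matplotlib; it is omitted here so the sketch runs headless.
```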
Asst. Prof. Emanuel Gull, Physics, is offering a mini-course introducing the Python programming language in a four-lecture series. Beginners without any programming experience as well as programmers who usually use other languages (C, C++, Fortran, Java, …) are encouraged to come; no prior knowledge of programming languages is required!
For the first two lectures we will mostly follow the book Learning Python. This book is available at our library. An earlier edition (with small differences, equivalent for all practical purposes) is available as an e-book. The second pair of lectures will introduce some useful Python libraries: numpy, scipy, and matplotlib.
At the end of the first two lectures you will know enough about Python to use it for your grad class homework and your research.
Special meeting place: we will meet in 340 West Hall on Monday September 11 at 5 PM.
Please bring a laptop computer along to follow the exercises!
Syllabus (Dates & Location for Fall 2017)
- Monday September 11 5:00 – 6:30 PM: Welcome & Getting Started (hello.py). Location: 340 West Hall
- Tuesday September 12 5:00 – 6:30 PM: Numbers, Strings, Lists, Dictionaries, Tuples, Functions, Modules, Control flow. Location: 335 West Hall
- Wednesday September 13 5:00 – 6:30 PM: Useful Python libraries (part I): numpy, scipy, matplotlib. Location: 335 West Hall
- Thursday September 14 5:00 – 6:30 PM: Useful Python libraries (part II): 3d plotting in matplotlib and exercises. Location: 335 West Hall
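As a small preview of the part-I libraries, here is a numpy sketch; exactly which topics the lectures cover is of course up to the instructor.

```python
import numpy as np

# Vectorized arithmetic: operations apply element-wise to whole arrays,
# with no explicit Python loop.
x = np.arange(6)            # array([0, 1, 2, 3, 4, 5])
squares = x ** 2

# Boolean-mask selection, a numpy idiom used constantly in practice.
evens = x[x % 2 == 0]

total = int(squares.sum())  # 0 + 1 + 4 + 9 + 16 + 25 = 55
```

Plotting these values would be a one-liner with matplotlib (`plt.plot(x, squares)`), which is what the part-I and part-II lectures build toward.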
Users of ARC-TS computing resources can now access the Internet from compute nodes.
Normally, compute nodes on ARC-TS clusters cannot directly access the Internet because they have private IP addresses. This increases cluster security while reducing costs (IPv4 addresses are limited, and ARC-TS clusters do not currently support IPv6). However, this also means that jobs cannot install software, download files, or access databases on servers located outside of University of Michigan networks: the private IP addresses used by the cluster are routable on-campus but not off-campus.
If your work requires these tasks, there are now three ways to allow jobs running on ARC-TS clusters to access the Internet: HTTP proxying, SOCKS proxying, and SSH tunneling.
The most common needs are installing Python or R packages from inside running jobs, and downloading or uploading data from inside running jobs.
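For the HTTP-proxying route, the usual pattern is to point the standard proxy environment variables at the cluster's proxy before any network activity. The host and port below are hypothetical placeholders; the real values come from the ARC-TS documentation.

```python
import os

# Hypothetical proxy address -- the real host and port for ARC-TS
# clusters are given in the ARC-TS documentation, not invented here.
proxy = "http://proxy.example.arc-ts.umich.edu:3128"

# Many tools (pip, curl, wget, Python's requests and urllib) honor these
# environment variables, so setting them at the top of a job script routes
# the HTTP(S) traffic of child processes through the proxy.
os.environ["http_proxy"] = proxy
os.environ["https_proxy"] = proxy
```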
See Accessing the Internet from ARC-TS Compute Nodes for more information.