Introduction to Computational Tools for Social Science

This course provides graduate students with the technical skills needed to conduct research in computational social science and digital humanities. It introduces the basic computer literacy, programming skills, and application knowledge that students need to succeed in further methods work.

The course is currently divided into five main sections. In the first section, students learn how their computers work and how they communicate with other computers, using Git and Bash. In the second, we turn our attention to the basics of R and Python. In the third, students learn tools for acquiring data through APIs and web scraping. In the fourth, students practice using R to clean and analyze data efficiently. In the fifth, students are exposed to additional means of analyzing and visualizing data, including text analysis and machine learning.

Please note that materials are still in development and will change.

Objectives

  • Understand basic programming terminology, structures, and conventions
  • Navigate and operate effectively in a UNIX environment
  • Understand basic Git and GitHub workflows
  • Write, execute, and debug R code for novel data collection, cleaning, analysis, and visualization
  • Write and execute basic code in Python
  • Be familiar with the concepts and tools of a variety of computational social science / digital humanities applications
  • Be familiar with the basic guidelines around reproducible research, good scientific computing practices, and ethics/privacy/legal quandaries
  • Learn independently, using online documentation to train themselves in a variety of computational applications and tasks (we will have access to DataCamp's courses for the duration of the semester)

Logistics

Instructor

Julia Christensen

[email protected]

Section Assistant

Anustubh Agnihotri

[email protected]

Times

Monday 2-4 pm and Wednesday 2-4 pm

Location

122 Barrows

Office Hours

By appointment.

bCourses

We will use bCourses for communication (announcements and questions) and turning in assignments. You should ask questions about class material and assignments through the bCourses website so that everyone can benefit from the discussion. We encourage you to respond to each other’s questions as well.

GitHub

All course materials will be posted on GitHub at https://github.com/juliachristensen/PS239T_Fall2019, including class notes, code demonstrations, sample data, and assignments. Students are required to use GitHub for their final projects, which will be publicly available unless there are special considerations (e.g. proprietary data).

Accessibility

This class is committed to creating an environment in which everyone can participate, regardless of background, discipline, or disability. If you have a particular concern, please come to me as soon as possible so that we can make special arrangements.

Course Requirements and Grades

This is a graded class based on the following:

  • Completion of assigned homework (50%)
  • Participation (25%)
  • Final project (25%)

Assignments

Weekly assignments will be due as follows:

| Date | Assignment |
| --- | --- |
| Thursday, August 29 | Fill out survey |
| Tuesday, September 3 | Bash/Unix/Git online tutorial |
| Wednesday, September 4 | Submit proof of installation |
| Sunday, September 8 | R DataCamp tutorial(s) |
| Sunday, September 15 | Tidyverse R DataCamp tutorial(s) |
| Tuesday, September 17 | Python DataCamp tutorial(s) |
| Sunday, September 22 | Python DataCamp tutorial(s) |
| Sunday, September 29 | Database exploration |
| Sunday, October 6 | Final project proposal |
| Sunday, October 13 | API/Webscraping project |
| Sunday, October 20 | Data cleaning project |
| Sunday, October 27 | Data visualization project |
| Sunday, November 3 | Final project update |
| November 25-December 4 | Final project presentations |
| Wednesday, December 11 | Final projects due |

Assignment details can be found on bCourses. Unless otherwise specified, assignments should be turned in as PDF documents via the bCourses site.

Time will be provided in class for additional exercises. Any exercises that are not completed in class should be completed before the beginning of the following class.

Extensions

If contacted in advance, instructors are generally willing to provide extensions.

Class Participation

The class participation portion of the grade can be satisfied in one or more of the following ways:

  • attending the lecture and section (note that section is non-optional)
  • asking and answering questions in class
  • contributing to class discussion through the bCourses site, and/or
  • collaborating with the campus computing community, either by attending a D-Lab or BIDS workshop, submitting a pull request to a campus GitHub repository (including the class repository), answering a question on StackExchange, or other involvement in the social computing / digital humanities community.

Because we will be using laptops every class, the temptation to attend to other things during slow moments will be high. While you may choose to do so, I do request that you think of your laptop screen as in the public domain for the duration of class time. Please do not load anything that will distract your classmates or is otherwise inappropriate to a classroom setting.

Final Project

The final project consists of using the tools we learned in class on your own data of interest. First- and second-year students in the political science department are encouraged to use this as an opportunity to gather data to be used for other courses or the second-year thesis. Students are required to write a short proposal by October 6 (no more than 2 paragraphs) in order to get approval and feedback from the instructors.

During the last few classes, we will have lightning talk sessions in which each student presents their project in a talk of at most 5 minutes, followed by 5 minutes of class Q&A. Since there is no expectation of a formal paper, you should select a project that can be completed by the end of the term. In other words, submitting a research design for a future dissertation that would use skills from the class but collects no data is not acceptable, whereas completing a small, self-contained portion of a study or thesis is.

Class Activities and Materials

Attendance Policy

Students will be expected to attend every class. Absences may be approved if the instructors receive notice in advance and a reasonable explanation is provided.

Lecture

Attendance at both the Monday and Wednesday classes is required. Instead of separate lectures and sections, all classes will follow a “workshop” style, combining lecture and lab formats.

Curriculum Outline / Schedule

| Date | Topic | Instructor |
| --- | --- | --- |
| Wednesday, August 28 | Introduction | Julia |
| Monday, September 2 | No class | n/a |
| Wednesday, September 4 | Bash/Unix/Git | Julia |
| Monday, September 9 | Intro to R | Julia |
| Wednesday, September 11 | Intro to R | Julia |
| Monday, September 16 | Intro to R | Julia |
| Wednesday, September 18 | Intro to Python | Anustubh |
| Monday, September 23 | Intro to Python | Anustubh |
| Wednesday, September 25 | Intro to Python | Anustubh |
| Monday, September 30 | APIs | Julia |
| Wednesday, October 2 | HTML + Intro to webscraping | Julia |
| Monday, October 7 | Webscraping | Anustubh |
| Wednesday, October 9 | Webscraping | Anustubh |
| Monday, October 14 | Data Cleaning | Julia |
| Wednesday, October 16 | Data Cleaning | Anustubh |
| Monday, October 21 | Data Visualization | Julia |
| Wednesday, October 23 | Data Visualization | Julia |
| Monday, October 28 | Organization & Collaboration | Anustubh |
| Wednesday, October 30 | Micro-lectures | TBD |
| Monday, November 4 | Micro-lectures | TBD |
| Wednesday, November 6 | Text analysis | TBD |
| Monday, November 11 | No class | n/a |
| Wednesday, November 13 | Text analysis | TBD |
| Monday, November 18 | Machine Learning | TBD |
| Wednesday, November 20 | Machine Learning | Chris Kennedy |
| Monday, November 25 | Presentations | n/a |
| Wednesday, November 27 | No class | n/a |
| Monday, December 2 | Presentations | n/a |
| Wednesday, December 4 | Presentations | n/a |

Computer Requirements

The software needed for the course is as follows:

  • Access to the UNIX command line (e.g., a Mac laptop, a Bash wrapper on Windows)
  • Git
  • R and RStudio (latest versions)
  • Anaconda and Python 3 (latest versions)
  • Pandoc and LaTeX

You will need a computer that can run all of this software. Almost any Mac will do the job. Most Windows machines are fine too, provided they have enough disk space and memory.

You must have all the software downloaded and installed PRIOR to the first day of class. If there are issues with installation on your machine, please contact the instructors for assistance.

See B_Install.md for more information.
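As a rough way to check your setup before the first class, the sketch below looks for each required tool on your PATH. It is not part of the official install instructions. The command names (`Rscript`, `conda`, `pdflatex`, and so on) are assumptions about how the tools are typically exposed and may differ on your machine. RStudio is a desktop application and is not checked here.

```python
import shutil

# Assumed command names for each requirement; these are illustrative guesses,
# not taken from the course materials. Adjust them for your platform.
REQUIRED_TOOLS = {
    "Bash": ["bash"],
    "Git": ["git"],
    "R": ["Rscript", "R"],
    "Python 3": ["python3", "python"],  # note: plain "python" may be Python 2
    "Anaconda": ["conda"],
    "Pandoc": ["pandoc"],
    "LaTeX": ["pdflatex", "xelatex", "lualatex"],
}


def find_missing(tools, which=shutil.which):
    """Return the names of tools for which no candidate command is on PATH."""
    missing = []
    for name, candidates in tools.items():
        # A tool counts as present if any one of its candidate commands resolves.
        if not any(which(cmd) for cmd in candidates):
            missing.append(name)
    return missing


def report(tools=REQUIRED_TOOLS):
    """Print one status line per tool and return the list of missing tools."""
    missing = find_missing(tools)
    for name in tools:
        status = "MISSING" if name in missing else "found"
        print(f"{name:10s} {status}")
    return missing


if __name__ == "__main__":
    report()
```

A tool reported as MISSING may still be installed but not on your PATH, so treat the output as a starting point for troubleshooting rather than a definitive answer.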

Books and Other Resources

There are no official textbooks for this class. Readings will be light and will be posted as part of the weekly homework assignments on bCourses. For the semester, we will have access to all of DataCamp's premium course materials (thank you, DataCamp!).