The SSCC offers both classes and an online curriculum you can work through on your own. You can view the current training schedule and register for classes, or find our online curriculum and learn more about the SSCC’s training program below.
Introduction to Stata will teach you the fundamentals of how Stata works and why. It (or comparable experience) is a prerequisite for the rest of SSCC’s Stata training, but it will also prepare you to excel in classes that use Stata.
- Using Stata
- Structure of a Stata Data Set
- Elements of Stata Syntax
- Do Files
- Creating and Changing Variables
Data Wrangling is the process of preparing data for analysis, which includes importing, cleaning, recoding, restructuring, combining, and anything else data needs before it can be analyzed. Data wrangling is a critical skill for research.
In this class you’ll learn how to wrangle data using Stata. We’ll cover some of the key concepts and workflows of data science as well as the structure and logic of Stata. We’ll emphasize real-world issues like handling missing data and checking for errors, as well as best practices for research computing and reproducibility. Our goal is to give you a strong foundation you can build on to become an expert data wrangler.
- Introduction and Review
- Reading in Data
- First Steps With Your Data
- Variable Transformations
- Hierarchical Data
- Restructuring Data Sets
- Combining Data Sets
- Project Management
- Learning More
Stata Workshops cover a variety of topics needed by many but not all researchers.
R with RStudio Basics is a quick introduction to the basics of R, using RStudio.
Data Wrangling is the process of preparing data for analysis. This course covers data wrangling in R: importing data, cleaning data, creating and transforming variables, extracting and merging data, and reshaping data.
- Defining Data
- Wrangling Vectors
- Wrangling Data Frames
- Restructuring with Base R
- Restructuring with the tidyverse
R Workshops cover a variety of topics needed by many but not all researchers.
Data Wrangling is the process of preparing data for analysis, which includes importing, cleaning, recoding, restructuring, combining, and anything else data needs before it can be analyzed. Data wrangling is a critical skill for research. This course teaches wrangling skills, mostly using the data wrangling tools of the Pandas package in Python. Pandas is a collection of functions and methods for working with data, comparable to R’s tidyverse.
This course will cover importing data, cleaning data, creating and transforming variables, merging data, and basic data visualization. It is a hands-on class, with time set aside to practice using these tools to prepare data for analysis.
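To give a rough sense of what these steps can look like in Pandas, here is a minimal sketch. It is not taken from the course materials: the data, the column names (`person_id`, `household_id`, `age`, `income`, `region`), and the use of -9 as a missing-value code are all made up for illustration.

```python
import io

import pandas as pd

# Hypothetical person-level data. In practice you would usually read a file
# with pd.read_csv("your_file.csv") instead of an in-memory string.
people_csv = io.StringIO(
    "person_id,household_id,age,income\n"
    "1,10,34,52000\n"
    "2,10,-9,48000\n"
    "3,20,51,\n"
)

# Hypothetical household-level data to merge in later.
households = pd.DataFrame(
    {"household_id": [10, 20], "region": ["Midwest", "South"]}
)

# Importing: read the raw text into a DataFrame.
people = pd.read_csv(people_csv)

# Cleaning: assume this data set codes missing ages as -9,
# and turn those codes into proper missing values (NaN).
people["age"] = people["age"].mask(people["age"] == -9)

# Creating a variable: income in thousands of dollars.
people["income_thousands"] = people["income"] / 1000

# Merging: attach household information to each person. validate= raises an
# error if household_id turns out not to be unique in the households data.
merged = people.merge(
    households, on="household_id", how="left", validate="many_to_one"
)

print(merged)
```

The `validate` argument is optional, but checking that a merge has the structure you expect is an inexpensive way to catch errors early.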
The primary purpose of the SSCC’s training program is to give social science graduate students with an interest in quantitative research the skills they need to do research with real-world data. It is intended to complement formal coursework in statistical analysis, so there is a heavy emphasis on the data wrangling skills that usually are not taught in such classes. The curriculum was developed by the SSCC’s statistical consultants and draws on their long experience assisting social science graduate students as they begin their research.
While the primary audience for SSCC training is social science graduate students, the skills taught will be valuable to a much broader audience: graduate students in other fields, faculty and staff researchers interested in enhancing their skills or learning a new statistical package, or undergraduates who are interested in graduate school, data-driven careers, or just gaining a deeper understanding of statistical software so they can excel in classes that use it. SSCC’s training is free to all UW-Madison faculty, staff, and students.
The core curriculum is taught four times per year: just before the start of the fall semester, during the fall semester, during the spring semester, and during the summer. However, all the class materials are available online, so students can start at any time and work at their own pace if they choose.
Training is offered in Stata, R, and Python (with the Python curriculum still in development). Most researchers will use a single statistical package in their work, and while we don’t discourage anyone from learning multiple packages, mastering one package will allow you to do more than a beginner’s understanding of several.
In choosing which package to learn, first consider which package will be used in the classes you need to take. As of this writing, the graduate programs in Sociology and the LaFollette Institute, plus the Economics Master’s degree program, mostly use Stata, while the Economics PhD program teaches Julia. R is most common elsewhere. Then identify which package is most commonly used by researchers who do work similar to the work you anticipate doing. Normally that means the package is a good fit for that kind of work, and at any rate using the same package will make it much easier to collaborate.
The sooner you learn to use a statistical package, the sooner you can benefit from it. However, if you don’t use it regularly, you’ll forget what you learned. In general we suggest graduate students take our core training during their first year, but if you won’t use the package for coursework, you might consider waiting until just before you start research. If possible, we recommend taking our training on a statistical package before taking a class that uses it. That way you’ll already know the software and can focus on the class material.
SSCC’s statistical consultants also occasionally offer workshops outside this curriculum, such as introductions to other software or particular statistical methods. They are also available for guest lectures. If you have a suggestion for a workshop or would like to arrange a guest lecture, contact the SSCC Help Desk.