SSCC News September 2021

Welcome, New Members!

We want to extend a warm welcome to all the new members of the Social Science Computing Cooperative, whether you’re a new faculty member, staff member, or graduate student who will use our resources for research, or an undergraduate taking a class that uses SSCC resources.

What is the SSCC?

The SSCC provides servers, software, training, and consulting to support researchers (and future researchers) who do statistical analysis. If you didn’t attend an orientation session, feel free to email the SSCC Help Desk, tell us about yourself, and ask what we can do for you.

What is SSCC News?

SSCC News is one of our main ways of getting information to our members. It comes out about once every two months. Please look over the email when you get it and then read the articles that will affect you.

If you’d rather not receive SSCC News, email helpdesk@ssc.wisc.edu and they can take care of that for you. If you’re no longer interested in SSCC News because you no longer use your SSCC account, they can close the account for you.

SSCC Fall Training

SSCC’s fall training is underway, but it’s not too late to learn Stata or R. The schedule includes reviews of regression in both languages. We also have a lot of workshops that will be of interest to veteran researchers:

  • Learn to run jobs on SSCC’s Linux servers, where they’ll have much more computing power and can run for days or weeks if needed
  • Use Globus to move data into Silo
  • Loops make writing code much, much faster by eliminating repetition. We’ll teach you how to use them in R or Stata.
  • Shiny allows you to make interactive data visualizations for the web using R (like the COVID-19 dashboards we’ve all spent too much time staring at). Think Tableau, but free.
  • Learn to collect data directly from the web using R with web scraping
  • Stata 17 has new tools for tables that are much more flexible than esttab or outreg2
  • Multiple Imputation of missing data is not a technique we particularly recommend as it’s a ton of work and easy to get wrong, but in this grad-student-requested workshop we’ll help you get it right

Visit the training page for details and to register.

PyCharm Available On SSCC Servers

As part of the SSCC’s summer update, we installed PyCharm on all of the SSCC’s interactive servers. PyCharm is a powerful development environment for writing Python code. It is comparable to Spyder, but more sophisticated. While Jupyter Notebooks are an excellent tool for teaching, communicating results, and simple tasks, for tasks of even moderate complexity we suggest writing your code in something like PyCharm.

Malicious Python Packages found on PyPI

In other Python news, security researchers recently identified eight malicious Python packages in the popular PyPI repository (the default source used by pip). Six of them were designed to steal the personal information of the programmer who downloaded them, but two were designed to infect any computer the resulting code was run on. In principle, similar things could be done with R’s CRAN or Bioconductor repositories or Stata’s SSC library, but they are smaller and less rewarding targets.

Installing well-known Python packages, like PyTorch or scikit-learn, is reasonably safe. But if you’re thinking about installing something that’s off the beaten path, please take a few minutes first to google it and make sure it comes from a reputable source.
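If you'd like a quick first look before you start googling, something like the following Python sketch may help. It assumes PyPI's public JSON endpoint (https://pypi.org/pypi/<name>/json) and pulls out a few fields worth a glance: the listed author, the home page, how many releases exist, and when the first file was uploaded. The function names fetch_pypi_metadata and summarize are our own, made up for illustration. A tidy result does not prove a package is safe, so treat this as a starting point rather than a verdict.

```python
import json
from urllib.request import urlopen


def fetch_pypi_metadata(name):
    """Download a project's public metadata from PyPI's JSON endpoint.

    Hypothetical helper for illustration; requires network access.
    """
    with urlopen(f"https://pypi.org/pypi/{name}/json", timeout=10) as response:
        return json.load(response)


def summarize(metadata):
    """Pick out a few fields worth checking before installing a package."""
    info = metadata.get("info", {})
    releases = metadata.get("releases", {})

    # Gather the upload time of every released file so we can find the
    # earliest one. The timestamps are ISO-style strings, which sort
    # correctly as plain text.
    upload_times = [
        released_file["upload_time"]
        for files in releases.values()
        for released_file in files
        if "upload_time" in released_file
    ]

    return {
        "name": info.get("name"),
        "author": info.get("author") or "(not listed)",
        "home_page": info.get("home_page") or "(not listed)",
        "release_count": len(releases),
        "first_upload": min(upload_times) if upload_times else None,
    }
```

To try it, run summarize(fetch_pypi_metadata("requests")) and print the result, substituting the name of the package you're considering for "requests". A package with a single release uploaded last week and no author or home page listed deserves a closer look than one with years of history.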

Ars Technica has more information about the incident.