We are very excited to announce the launch of a pilot Slurm cluster at the SSCC. Slurm (Simple Linux Utility for Resource Management) is a powerful system for managing and scheduling computing jobs that is popular at large-scale research computing centers. We are starting with a small pilot cluster as we all learn more about it, but we anticipate that Slurm will become the main way SSCC researchers run large research computing jobs.
The easiest way to submit a job to Slurm is to log into Linstat and use the ssubmit command. For example:
ssubmit --cores=32 --mem=20g "stata -b do my_do_file"
Slurm will then find a server in the Slurm cluster that has 32 cores and 20GB of memory available, and run your Stata do file there. It’s possible your job might have to wait until those resources are free, but once it starts it is guaranteed exclusive use of them—no more having your job slow down or run out of memory because the server is busy. If you’re not sure how many cores and how much memory your job uses, Identifying the Computing Resources Used by a Linux Job will show you how to find out.
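If you want a quick look at a job's footprint from the command line right away, one general-purpose approach (not SSCC-specific, and no substitute for the guide above) is to start the job in the background and sample it with ps, which reports resident memory and CPU use:

```shell
# Start your job in the background. "sleep 30" is a stand-in here;
# replace it with your real command, e.g. stata -b do my_do_file.
sleep 30 &
pid=$!

# Sample the job's memory (RSS, in kilobytes) and CPU percentage
# while it runs. Repeat during the job's heaviest phase.
ps -o rss=,pcpu= -p "$pid"
```

The peak RSS you observe (converted to gigabytes, with some headroom) is a reasonable starting point for the --mem value you pass to ssubmit.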
Right now the Slurm cluster consists of four servers with 44 cores and 384GB of memory each. If it gets busy (and we hope it will) we’ll move servers from the Condor and Linstat clusters into Slurm. We plan to create a Slurm cluster within Silo as well.
To learn more about the Slurm cluster and how to use it, read Using the SSCC Slurm Cluster. We will also hold three short training sessions on Slurm over the next two weeks, two of them online. Visit our training page for details and to register.
The SSCC’s summer training schedule is now available. This includes introduction and data wrangling workshops in Stata, R, and Python, and regression workshops in Stata and R. We especially want to highlight workshops on regression diagnostics in Stata and R. Researchers who need to run big computing jobs are encouraged to sign up for one of our workshops on running them at the SSCC, which will cover both Linux basics and Slurm.
We’ve also scheduled introduction and data wrangling workshops in Stata, R, and Python for the week of August 29th. All are welcome, but these workshops are especially for incoming graduate students. In our experience, the sooner most graduate students become competent at using statistical software the better (and no, what they learn as undergraduates is not usually sufficient for graduate work). If you are in contact with incoming graduate students, please make sure they are aware of these workshops.
With the permanent shift towards remote work, we realize that many of you rely on Winstat. The challenge is that when one person runs a big statistical job on Winstat, it can slow down that Winstat server for everyone who's using it—especially if the job runs the server out of memory. We've done a variety of things to address this, including load balancing that moves you to the least busy server when you log out and back in, temporary pandemic-era steps like making lab computers available remotely as alternatives to Winstat, and many other changes behind the scenes. But we're planning two more, and the first in particular should make a dramatic difference:
First, we want to separate the big jobs from the day-to-day interactive work most of you do on Winstat. We plan to do this by having a large number of small Winstat servers and a small number of large Winstat servers. The large Winstat servers will be similar to the current servers and intended for medium-size jobs (truly big jobs should move to the Linux servers). The small Winstat servers will be designed for the kinds of work you’d do on a laptop or desktop computer (if your computer had all the software Winstat does and fast access to the SSCC network drives). Since we’ll have more of the small Winstats you’ll be sharing a server with a smaller number of people—none of whom should be running big statistical jobs. This will give more consistent performance.
In the coming weeks we’ll launch a public beta test of the smaller Winstat servers (one class is already using them). You’ll see the new servers when you sign in to use Winstat, and we encourage you to try them out. Send any feedback to the Help Desk.
Second, we will begin to automatically check for and terminate jobs that are running the Winstat servers out of memory (and are in violation of our Server Usage Policy). The process will also send an email to the job’s owner. If you find your jobs are being killed, our Statistical Consultants may be able to help you reduce their memory usage. They can also help you learn to run your jobs on the SSCC’s Linux servers, which have much larger amounts of memory available.
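If you'd like to guard against runaway memory use yourself, one generic bash technique (separate from the SSCC's automated check described above) is to cap your shell's virtual memory with ulimit before launching a job, so an oversized allocation fails with an error instead of exhausting the server:

```shell
# Cap this shell's virtual memory at roughly 20 GB (ulimit -v takes KB).
# This is a general bash technique, not the SSCC's enforcement mechanism.
ulimit -v $((20 * 1024 * 1024))

# Any job started from this shell now fails cleanly if it exceeds
# the cap, e.g.:
stata -b do my_do_file
```

The limit applies only to the current shell session and the processes it starts; it resets when you log out.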
The SSCC suspended its annual account renewal process during the first two years of the COVID-19 pandemic so as not to cut anyone off from essential computing services, but as things return to normal, we need to return to annual account renewals as well.
Lab users must renew their accounts by April 4th. If you are a lab user, you should receive an email with a customized link that you will use to renew your account.
Full members must renew their accounts by May 31st. To renew, use the link to our account renewal page and follow the instructions there.
If you have graduated or otherwise left UW-Madison, do not renew your account. If you go to the form and answer “No” to “Do you want your account renewed?” we won’t send you any more reminders.
If you have left UW-Madison but are still collaborating on a project here, renew your account. However, your collaborator should speak with the administrative staff in the agency that sponsors your account about continuing to sponsor it. Note that these decisions are made by our member agencies, not SSCC staff.