High Performance Computing and R – WSU’s Kamiak Cluster

This week we had a guest speaker, Jeff White from IT, who discussed accessing the Kamiak High Performance Computer on campus (slides can be found here). We also discussed creating .csv files and getting that data into R.

Kamiak may be used by any student with approved access. To get access, contact CIRC, the Center for Institutional Research Computing, which runs Kamiak, through their Service Desk. You will need to make an account first, and your adviser or project PI will need to vouch for you.

Kamiak is a large computer, or “cluster” of smaller computers which work in tandem. Kamiak is a Linux system. What that means functionally is that you access it through what is called the “secure shell”, or ssh. This is a program that communicates with the cluster remotely: you open it on your personal computer and can then run programs and software on Kamiak. It is not a point and click system; instead you type commands at the Linux command line. Information on how to install or open ssh software on your own computer can be found here: https://hpc.wsu.edu/users-guide/terminal-ssh/.

Once you have ssh running and an active Kamiak account, you log in using your WSU credentials. There are a vast number of commands you can use to communicate with the computer. Here is a good resource for learning Linux in general, which goes over both the shell (the command line) and how to write scripts to run programs: http://linuxcommand.org/. Jeff used a number of quick commands in his lecture, which I have summarized below and on our Resources page.
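To make the login step concrete, here is a minimal sketch for Mac, Linux, or any terminal that has ssh. The user name your.name is a placeholder, and the host name kamiak.wsu.edu is an assumption that you should confirm in the CIRC user's guide. The sketch only prints the command; the line that actually connects is left commented out.

```shell
# Hypothetical user name: replace with your own WSU network ID.
KAMIAK_USER="your.name"
# Assumed login host name; confirm it in the CIRC user's guide.
KAMIAK_HOST="kamiak.wsu.edu"

# Build the login command and print it so you can check it first.
SSH_CMD="ssh ${KAMIAK_USER}@${KAMIAK_HOST}"
echo "$SSH_CMD"

# Uncomment to actually connect (you will be asked for your WSU password):
# ssh "${KAMIAK_USER}@${KAMIAK_HOST}"
```

Once connected, the prompt changes to show that you are typing commands on Kamiak rather than on your own computer; typing exit closes the connection.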

On Kamiak, the primary way of submitting and managing “jobs” (programs the computer is running) is through scheduling software called “Slurm”. The following commands all have an “s” in the front because they are Slurm-specific commands. They are not generic Linux commands, though generic Linux commands work on Kamiak as well. For more information see the entire Training PDF.

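sinfo #shows which partitions and nodes are available to use
sbatch #submits a job script to the queue
scontrol #shows details of jobs. Example: scontrol show job 345
scancel #cancels jobs. Example, to cancel job number 345: scancel 345
sq #shows all of your running or pending jobs (the standard Slurm equivalent is squeue -u followed by your user name)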

#Other commands
idev #opens an interactive session so you can run programs without writing a .sh script and submitting it to the computer
cat slurm-JOBNUMBER.out #prints the output file of a specific job (this is Slurm's default output file name). Example: cat slurm-345.out

In general, Kamiak and Linux systems work like this: you write a “script”, basically a set of commands for the computer to carry out on its own, then you submit that script to the computer and look at the results afterwards. These script files are .sh files and can be written in a number of different programs, called text editors. A basic one, which you can use directly on Kamiak to create and edit files, is vim, opened with the “vim” command. Once you have written the instructions into your .sh file, you move the file, and any associated data, to Kamiak and tell Kamiak to run it. Kamiak will run it as commanded, and the output will be saved wherever you have directed it to be saved. A great example of running a file, and of a simple Kamiak .sh script, can be found on the Kamiak website here.
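As a rough illustration of that workflow, here is a minimal sketch that writes a small job script and submits it if Slurm is available. The file name hello_kamiak.sh, the job name, and the resource values are all hypothetical, and your account may also need a partition set with --partition. Treat the example on the Kamiak website as the authoritative version.

```shell
# Write a small job script to the current directory.
# The file name and all #SBATCH values are placeholders for illustration.
cat > hello_kamiak.sh <<'EOF'
#!/bin/bash
# Name shown in the queue.
#SBATCH --job-name=hello
# Output file; %j is replaced by the job number.
#SBATCH --output=hello_%j.out
# Run one task on a single node.
#SBATCH --nodes=1
#SBATCH --ntasks=1
# Wall-clock time limit, HH:MM:SS.
#SBATCH --time=00:05:00

echo "Hello from $(hostname)"
EOF

# Check the script for shell syntax errors without running it.
bash -n hello_kamiak.sh

# Submit only where Slurm is installed (that is, on Kamiak itself).
if command -v sbatch >/dev/null 2>&1; then
    sbatch hello_kamiak.sh
else
    echo "sbatch not found: copy hello_kamiak.sh to Kamiak and submit it there"
fi
```

The lines starting with #SBATCH look like comments to the shell, but Slurm reads them as instructions about how to run the job. After submitting, the output of the echo line ends up in the file named by --output rather than on your screen.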

There are a few different ways to move files to and from Kamiak. For Mac or Linux users it can be done relatively easily, as there are built-in programs such as scp and sftp that let you transfer files. For Windows users a great program to use is WinSCP. You can use it either through the command line (that is, by typing commands) or through a point and click interface. All of these programs work the same way: you first connect to Kamiak from your computer, then move files, then disconnect.
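For Mac and Linux users, a transfer with the built-in scp program might look like the sketch below. The user name, the file names mydata.csv and results.csv, and the host name kamiak.wsu.edu are assumptions for illustration, so check the host name in the CIRC user's guide. The sketch prints the commands rather than running them, so nothing is transferred until you run them yourself.

```shell
# Hypothetical values: replace with your own user name and file.
KAMIAK_USER="your.name"
# Assumed login host name; confirm it in the CIRC user's guide.
KAMIAK_HOST="kamiak.wsu.edu"
LOCAL_FILE="mydata.csv"

# Upload: copy a local file to your home directory ("~/") on Kamiak.
UPLOAD_CMD="scp ${LOCAL_FILE} ${KAMIAK_USER}@${KAMIAK_HOST}:~/"
# Download: copy a (hypothetical) results file from Kamiak to the current folder.
DOWNLOAD_CMD="scp ${KAMIAK_USER}@${KAMIAK_HOST}:~/results.csv ."

# Print the commands so they can be checked before use.
echo "$UPLOAD_CMD"
echo "$DOWNLOAD_CMD"
```

To run a transfer, paste one of the printed lines into your terminal. Each scp command connects, copies the file, and disconnects in one step, and asks for your password along the way.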

Here is an example of creating a .csv file in R, then moving it to Kamiak, on a Windows computer. Mac and Linux users will have similar experiences.

Creating the File

Connecting to Kamiak to transfer the file using WinSCP

Connecting to Kamiak, note the name of Kamiak and the port number.

Moving the file

Using R on Kamiak

When using R on Kamiak it is important to create a default location in your own home directory for packages to install to. Our own Tung Nguyen has created a guide for us that is on the Kamiak website: https://hpc.wsu.edu/r/
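One common way to set up such a location is sketched below, using R's standard R_LIBS_USER setting in the .Renviron file. The folder name R/library is a hypothetical choice, and the guide linked above may recommend a different layout, in which case follow the guide.

```shell
# Hypothetical location for personal R packages; any folder in your home directory works.
R_LIB_DIR="$HOME/R/library"

# Create the folder (no error if it already exists).
mkdir -p "$R_LIB_DIR"

# R reads ~/.Renviron at startup; R_LIBS_USER tells it where your personal library is.
RENVIRON="$HOME/.Renviron"
touch "$RENVIRON"

# Only add the setting if the file does not already have one.
if ! grep -q '^R_LIBS_USER=' "$RENVIRON"; then
    echo "R_LIBS_USER=$R_LIB_DIR" >> "$RENVIRON"
fi

# Show the setting that R will pick up.
grep '^R_LIBS_USER=' "$RENVIRON"
```

After this, a new R session should list the folder when you run .libPaths(), and install.packages() should place packages there instead of asking for a system location you cannot write to.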