Linux Background Jobs and Scripts

There are many advantages to running your compute jobs on the Statistics department Linux machines. Not only are the machines fairly powerful, but you are able to log out and have the job continue to run. (Do remember to adhere to the Statistics department's policies on allowed number of jobs, which server(s) you are able to run them on, and breakpoint requirements. Failure to follow departmental policies will result in your jobs being killed.) Although this article uses Matlab as an example, the same procedure can be followed for C, R, etc. 

The first step is to ssh into the server you want to use. There is a full description of how to do this here.

  1. Starting a Background Job
  2. Killing a Background Job
  3. Shell Scripts
  4. Resources

Starting a Background Job

Once you are on a Linux machine, a command like the following will start a background job, in this example a Matlab job:

stat-pippin$ matlab -nodesktop -nodisplay < file.m &> file.out &

("stat-pippin$" is just the prompt with the hostname that you will see.)

matlab is the command to start Matlab. Since we are not working graphically (you can't display graphics in a background job), we add the -nodesktop -nodisplay options so that Matlab doesn't try to start graphics and crash.

The < tells the shell to send input from a file to Matlab. In this case we used file.m, but it can be called anything. It should contain the Matlab commands you want to execute. The > sends Matlab's output to a file, in this case file.out. The & before the > is optional, but it sends error messages to file.out as well, so it is good to include it. The final & puts the job in the background so you don't have to stay logged in for it to run.
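To see the difference between capturing output only and capturing output plus errors, here is a minimal sketch. The fake_job function and the file names are hypothetical stand-ins, not part of any department setup. It uses the portable form > file 2>&1, which behaves the same as bash's &> file.

```shell
# Work in a throwaway directory so no real files are touched.
demo_dir=$(mktemp -d)

# Hypothetical stand-in for a compute job: one line on stdout, one on stderr.
fake_job() {
    echo "result: 42"
    echo "warning: something went wrong" >&2
}

# Plain > captures standard output only; the error line is discarded here
# (2> /dev/null) just to keep the terminal quiet.
fake_job > "$demo_dir/stdout_only.out" 2> /dev/null

# "> file 2>&1" captures both streams; in bash, "&> file" is shorthand for it.
fake_job > "$demo_dir/both.out" 2>&1

cat "$demo_dir/both.out"
```

The second file contains the warning as well as the result, which is usually what you want when a job fails overnight and you need to find out why.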

While a job is running, you can log in to the machine and type top to see all the processes that are running (q quits top). You can view the contents of the output file while the job is running by typing more file.out. The tail command is also useful if you just want to see the last few lines of the output file. If your job is going to write a lot, it will run faster if you use the /scratch file system instead of your home directory, because scratch is local to the machine rather than networked. Scratch is not backed up, so copy your results to your home directory afterward.
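As a small illustration of tail, the sketch below builds a stand-in output file (the file and its contents are made up for the example) and prints only its last three lines.

```shell
# Create a stand-in output file so the commands below have something to read.
out_file=$(mktemp)
for i in 1 2 3 4 5 6 7 8 9 10; do
    echo "iteration $i" >> "$out_file"
done

# Show only the last three lines instead of paging through the whole file.
last_lines=$(tail -n 3 "$out_file")
echo "$last_lines"

# To keep watching a file as a running job appends to it (Ctrl-C to stop):
#   tail -f file.out
```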

Note: if you are using the time command, you must put { } around all the commands whose output you want redirected, like so:

rohan$ { time matlab -nodesktop -nodisplay < file.m ; } &> file.out & 
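The reason for the braces can be seen in a short experiment. This sketch assumes bash, where time is a shell keyword, and uses sleep as a hypothetical stand-in for the real job.

```shell
timing_dir=$(mktemp -d)

# Without braces, the redirection applies only to the timed command, so in
# bash the timing report still goes to the terminal and the file stays empty.
time sleep 1 > "$timing_dir/no_braces.out" 2>&1

# With braces, the whole group is redirected, timing report included.
{ time sleep 1 ; } > "$timing_dir/with_braces.out" 2>&1

cat "$timing_dir/with_braces.out"
```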

Important: Use the exit command when exiting the shell; if you simply close the terminal window, bash assumes you've been suddenly disconnected and attempts to kill any child processes. This will include your background job.

Killing a Background Job

Occasionally, for whatever reason, you might need to end your jobs while they're running. This is easily done with the kill command. The kill command takes a PID, or process ID (which is a unique identifier every process has) and sends a signal to terminate that process.

The first thing you want to do is find out which process is your own shell, so that you don't kill it by accident. To find out what TTY (from teletypewriter, which was an early name for the terminal) you're connected through, type tty. This will return something like /dev/pts/#, where # is some integer. For example, my result is /dev/pts/2. Remember this as it will be useful in a minute.

Next we'll use ps to find all of our processes: 

stat-mordor$ ps

  PID TTY          TIME CMD
21417 pts/2    00:00:00 bash
21514 pts/2    00:00:01 MATLAB
21591 pts/2    00:00:00 ps

Remember that we definitely don't want to kill the shell running on our TTY, since that would log us out. In this case, that's the first process shown: bash. We can tell this because its TTY is the same as our command earlier told us ours was, pts/2. We might, however, want to terminate Matlab. We can see that it has a PID of 21514, so we invoke the kill command with that PID, i.e. kill 21514. Depending on a number of factors, it's possible that the command will execute but not actually stop the process. For this reason it's a good idea to try to use the same command again (you can access old commands by pressing up on the command line). If you receive the following message

-bash: kill: (21514) - No such process

then you will know that the process was successfully killed. If not, the process was not terminated and you will probably want to try a "stronger" kill command. In that case, change your command to kill -9 21514, replacing 21514 with the PID you're trying to kill. This sends SIGKILL, a signal that cannot be caught or ignored, instead of the default SIGTERM signal. You can read more on signals here.
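The same sequence can be scripted. The following is a minimal sketch that uses sleep as a hypothetical stand-in for a long-running job; it relies on $! (the PID of the most recent background job) and on kill -0, which checks whether a process exists without sending it a signal.

```shell
# Hypothetical stand-in for a long-running job.
sleep 300 &
job_pid=$!    # $! holds the PID of the most recent background job

# Ask politely first: plain kill sends SIGTERM.
kill "$job_pid"
# Reap the job so its PID is released; the nonzero exit status is expected.
wait "$job_pid" 2>/dev/null || true

# kill -0 delivers no signal; it only checks whether the process exists.
if kill -0 "$job_pid" 2>/dev/null; then
    job_state="running"
    kill -9 "$job_pid"    # escalate to SIGKILL, which cannot be caught
else
    job_state="terminated"
fi
echo "job $job_pid is $job_state"
```

Trying the plain kill first gives the program a chance to clean up; kill -9 gives it none, so it is best kept as a last resort.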

Shell Scripts

It can be very useful to have several processes automatically execute in sequence, like a set of instructions or a program. This is where shell scripts come in. A shell script is a file containing a series of commands that the shell executes, usually one after another. Shell scripts can get very complicated, but they need not be so. Consider the following segment:

#!/bin/bash

echo "Somebody's poisoned the waterhole!"

It might not look like much, but it's a perfectly valid script; just run it and see! Before we run it, we have to make sure we have permission to execute it. We can do this with the chmod command. Let's say we want read, write, and execute permissions on this, but we don't want anyone else to be able to do anything with it. In that case, the proper permission combination would be 700. Assuming we saved our script as my_script.sh (the .sh extension is only a convention; the #!/bin/bash line at the top is what tells the system to run it with bash), we'd use chmod 700 my_script.sh. Once we've done that, we can start the script by typing ./my_script.sh. Assuming there are no errors, the script should execute and print the quote in the console.
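The steps above can be sketched end to end as follows. The temporary directory is only there to keep the example from cluttering your home directory; on the servers you would normally just work in your own directory.

```shell
# Write the script into a throwaway directory (the path is illustrative).
script_dir=$(mktemp -d)
cat > "$script_dir/my_script.sh" <<'EOF'
#!/bin/bash
echo "Somebody's poisoned the waterhole!"
EOF

# 700 = read, write, execute for the owner; nothing for group or others.
chmod 700 "$script_dir/my_script.sh"
ls -l "$script_dir/my_script.sh"

# Run it by path; the shebang line selects the interpreter.
script_output=$("$script_dir/my_script.sh")
echo "$script_output"
```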

Admittedly, this script is a little on the dull side. However, consider something like this: 

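#!/bin/bash

# Run an S-PLUS batch job
Splus BATCH file.s
# Run a compiled program from the current directory; output and errors go to outfile1
./a.out < infile1 &> outfile1
# Run a SAS program
sas < prog.sas
# Run Matlab without the desktop, display, Java VM, or splash screen
matlab -nodesktop -nodisplay -nojvm -nosplash < infile.m &> matlab.out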

Now we're getting somewhere. And this is only the tip of the iceberg of how powerful scripts can be.
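A script like this can itself be launched as a single background job, using the same redirection as before. The sketch below is one way to do it; pipeline.sh and its two echo steps are hypothetical stand-ins for the real analysis commands.

```shell
work_dir=$(mktemp -d)
cat > "$work_dir/pipeline.sh" <<'EOF'
#!/bin/bash
# Hypothetical stand-ins for the real analysis steps.
echo "step 1 finished"
echo "step 2 finished"
EOF
chmod 700 "$work_dir/pipeline.sh"

# Launch the whole script as one background job, capturing output and errors.
"$work_dir/pipeline.sh" > "$work_dir/pipeline.out" 2>&1 &
pipeline_pid=$!

# wait is only needed in this demo so the output can be inspected right away.
wait "$pipeline_pid"
cat "$work_dir/pipeline.out"
```

Because the steps run one after another inside the script, the whole sequence counts as one job to start, monitor, and (if necessary) kill.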

Resources

There are a huge number of resources available online that will help answer any questions you have about Linux or background jobs. Here are just a few recommendations.

  • Writing Shell Scripts is a good tutorial for beginners with shell scripts. It eventually gets into flow control (logic) and beyond. 
  • StackOverflow is an excellent resource for any computer-related questions.
  • Your local IT department
  • In the right column of this page is a PowerPoint presentation done about our compute servers. This has information on departmental policies and other useful server-related information.

 

 

Details

Article ID: 34528
Created: Fri 7/28/17 1:37 PM
Modified: Sun 3/31/24 2:00 PM