connecting to remote machine to run long jobs
connecting to remote machines: ssh, scp
ssh: secure shell
ssh ane@stat-remote.stat.wisc.edu # need to authenticate. password: # characters not shown
hostname
logout
hostname # just in case I switch a lot and don't remember
message after first login to a machine:
RSA key fingerprint is SHA256:xxx.
Are you sure you want to continue connecting (yes/no)?
if yes, the information is stored in the ~/.ssh folder:
less -S ~/.ssh/known_hosts
whoami
ssh lunchbox.stat.wisc.edu # login name not needed if same as on local machine
cd private/st679/
ls
emacs -nw notes/statservers.md # or nano or vim: no new window. ^X^C to quit emacs.
logout
To use an editor that needs a new window, you need to enable X11 forwarding. This can be a headache, and can make things a lot slower.
To avoid typing your password each time: copy the public key from your laptop (in id_rsa.pub) to the file ~/.ssh/authorized_keys on the remote server.
You created a key pair earlier for github, and copied your public key to your
github profile.
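The copy step can be sketched in shell. This is only a minimal sketch: on most systems the standard tool `ssh-copy-id user@host` does this for you. The function name `add_pubkey`, the demo paths under a temporary directory, and the placeholder key string are all hypothetical; on a real server the target file would be ~/.ssh/authorized_keys.

```shell
# add_pubkey PUBKEY_FILE AUTH_FILE   (hypothetical helper, not a standard command)
# Append a public key to an authorized_keys file unless it is already listed,
# then set the restrictive permissions that ssh servers usually expect.
add_pubkey() {
  pubkey_file=$1
  auth_file=$2
  key=$(cat "$pubkey_file")
  mkdir -p "$(dirname "$auth_file")"
  chmod 700 "$(dirname "$auth_file")"
  touch "$auth_file"
  # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
  if ! grep -qxF -e "$key" "$auth_file"; then
    printf '%s\n' "$key" >> "$auth_file"
  fi
  chmod 600 "$auth_file"
}

# Demo in a temporary directory, with a placeholder instead of a real key.
demo_dir=$(mktemp -d)
demo_auth="$demo_dir/remote_home/.ssh/authorized_keys"
echo "ssh-rsa PLACEHOLDER-NOT-A-REAL-KEY ane@laptop" > "$demo_dir/id_rsa.pub"
add_pubkey "$demo_dir/id_rsa.pub" "$demo_auth"
add_pubkey "$demo_dir/id_rsa.pub" "$demo_auth"   # second call adds nothing
cat "$demo_auth"
```

Checking for the key before appending keeps the file free of duplicates if the step is repeated.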
scp: secure copy. Same as cp, but you need to provide the user name, the machine name and the path on the remote machine (relative to your home directory there, or absolute). Works over ssh.
scp -p ane@stat-remote.stat.wisc.edu:private/st679/notes/statservers.md notes/
ls -l notes/statservers.md
scp -r stat-remote.stat.wisc.edu:private/st679/classroom-repos/hw1/log .
ls -l
ls -l log/
echo "hi Cecile" > coolfile
scp coolfile ane@stat-remote.stat.wisc.edu:private/ # works both directions
slight difference between cp and scp (with the cp that comes with a Mac; GNU cp on Linux copies the directory itself in both cases):
- cp -r log/ target/path/ copies the content of the log/ directory
- cp -r log target/path/ copies the directory itself and its content
rm -r log
scp -r stat-remote.stat.wisc.edu:private/st679/classroom-repos/hw1/log/ .
ls
rm -r log
cp -r classroom-repos/hw1/log .
ls
rm -r log
cp -r classroom-repos/hw1/log/ .
ls
rm *.log
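Because the effect of a trailing slash depends on which cp is installed, it can help to be explicit. The following is a minimal sketch in a temporary directory; the names `dest_dir` and `dest_contents` are made up for the demo. The `log/.` form asks for the content of the directory only, and behaves the same with GNU and BSD cp.

```shell
demo=$(mktemp -d)
mkdir "$demo/log" "$demo/dest_dir" "$demo/dest_contents"
touch "$demo/log/run1.log" "$demo/log/run2.log"

# No trailing slash: the directory itself is copied.
cp -r "$demo/log" "$demo/dest_dir/"
ls "$demo/dest_dir"          # shows: log

# "log/." : only the content of log is copied.
cp -r "$demo/log/." "$demo/dest_contents/"
ls "$demo/dest_contents"     # shows: run1.log and run2.log
```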
long-running jobs: nohup
Previous example of long-running job: get data and link to program here
cd classroom-repos/lecture-examples/mrbayes-example/
ls
head -n 17 alignedDNA.nex
head mrBayes-run.nex
mb mrBayes-run.nex > screenlog &
tail -f screenlog # control-C to stop
Now, I want to run this on the stat servers, log out, log back in tomorrow to get the results.
- log out: exit the terminal, all jobs started from it will be sent a “hang up” signal (SIGHUP), and killed. (recall: SIGKILL with kill -9, SIGINT with control-C, SIGTSTP with control-Z)
- nohup: runs the command so that it ignores this hang-up signal
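The effect of the hang-up signal can be tried locally, without logging in anywhere. This is a minimal sketch that sends SIGHUP by hand to two `sleep` jobs, standing in for a real long job; the variable names are made up for the demo.

```shell
# Start two identical background jobs; only the second is protected by nohup.
sleep 20 &
pid_plain=$!
nohup sleep 20 > /dev/null 2>&1 &
pid_nohup=$!
sleep 1                          # give nohup time to start its command

# Send the hang-up signal by hand to both jobs.
kill -HUP "$pid_plain" "$pid_nohup"

# The unprotected job dies: the shell reports 128 + signal number (SIGHUP is 1).
status_plain=0
wait "$pid_plain" || status_plain=$?
echo "plain job exit status: $status_plain"

sleep 1
# kill -0 sends no signal: it only checks that the process still exists.
if kill -0 "$pid_nohup" 2> /dev/null; then nohup_alive=yes; else nohup_alive=no; fi
echo "nohup job still running: $nohup_alive"

kill "$pid_nohup"                # clean up with the default signal (SIGTERM)
wait "$pid_nohup" 2> /dev/null || true
```

The job started with nohup survives SIGHUP but can still be stopped with another signal, here the default SIGTERM.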
ssh stat-remote.stat.wisc.edu
cd private/st679/classroom-repos/lecture-examples/mrbayes-example/
mb mrBayes-run.nex > screenlog &
logout
ssh stat-remote.stat.wisc.edu
ps -u ane | grep mb # gone
cd private/st679/classroom-repos/lecture-examples/mrbayes-example/
tail screenlog
let’s do this again but with nohup before our command:
rm -f alignedDNA.nex.* screenlog # clean up
nohup mb mrBayes-run.nex > screenlog &
logout
ssh stat-remote.stat.wisc.edu
ps -u ane | grep mb # still there if lucky
cd private/st679/classroom-repos/lecture-examples/mrbayes-example/
ls -l
rm -f alignedDNA.nex.* screenlog
but nohup does not work well on our stat servers, actually.
AFS (Andrew file system): has a very strong authentication system,
uses “tokens” to grant permissions, issued from Kerberos “tickets”.
tokens # has expiration date
klist # first: ticket-granting ticket, second: AFS token
When I log out, I lose my token, and with it my permission to read from or write to files. My process keeps running, but fails with an error as soon as it tries to read or write a file.
tmux: terminal multiplexer
like GNU screen, it solves several challenges:
- tmux sessions can be detached and reattached: detach, log out, log back in, re-attach
- multiple windows in a session: say one with an editor, and another for shell commands
do which tmux to see if you already have tmux installed. if not: brew install tmux on a Mac, sudo apt-get update then sudo apt-get install tmux on Ubuntu.
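The same check can be written so that it works inside a script. A minimal sketch, where `have_cmd` is a hypothetical helper name; `command -v` is the portable counterpart of `which`.

```shell
# have_cmd NAME (hypothetical helper): succeed if the shell can find
# a command called NAME.
have_cmd() {
  command -v "$1" > /dev/null 2>&1
}

if have_cmd tmux; then
  echo "tmux found at: $(command -v tmux)"
  tmux -V                       # print the installed version
else
  echo "tmux not found: install it with brew (Mac) or apt-get (Ubuntu)"
fi
```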
check your tmux configuration: use this .tmux.conf file to change the default prefix to ^a (to be the same as screen, otherwise default is ^b).
cat ~/.tmux.conf # if you don't have this file: create it
emacs ~/.tmux.conf # uncomment last 2 lines
In this file, I suggest that you uncomment the last 2 lines, to split panes more intuitively.
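The linked file is not reproduced in these notes, so the following is only a guess at the kind of settings it could contain, not its actual content. The sketch writes to a temporary file rather than to ~/.tmux.conf, so that nothing is overwritten; the choice of | and - for splitting is an assumption based on the key table used later.

```shell
conf_file=$(mktemp)      # hypothetical location; the real file is ~/.tmux.conf
cat > "$conf_file" <<'EOF'
# use ^a as the prefix, like GNU screen, instead of the default ^b
set -g prefix C-a
unbind C-b
bind C-a send-prefix

# split panes with | (side by side) and - (stacked)
bind '|' split-window -h
bind - split-window -v
EOF
cat "$conf_file"
```

In tmux, `split-window -h` places the new pane next to the current one, and `split-window -v` places it below.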
Now run tmux:
tmux new-session -s mb-analysis # new screen shows up
echo 'hi cecile!'
pwd
# ^a d to detach the session
tmux list-sessions
tmux attach
# ^a d to detach again
tmux attach -t mb-analysis
# ^a d to detach yet again
logout
ssh stat-remote.stat.wisc.edu
tmux list-sessions # mb-analysis still there
ps -u ane | grep tmux # still running
tmux attach
logout # to start new session (lost token)
tmux windows and panes
tmux new-session -s mb-analysis
cd private/st679/classroom-repos/lecture-examples/mrbayes-example/
ls
rm -f alignedDNA.nex.* screenlog # clean up things that didn't finish
emacs mrBayes-run.nex # change ngen=1000000 to 5000000
# ^a c to create a new window
pwd # same directory as previous window
nano readme.md # document change in mb options, etc.
# ^a c creates a 3rd window
mb mrBayes-run.nex > screenlog &
# ^a | new pane: splits vertically
now let’s switch between windows and between panes:
tmux keys | action |
---|---|
^a d | detach |
^a c | create new window |
^a \| or ^a - | splits window vertically or horizontally (depends on config file) |
^a n or ^a p | go to next or previous window |
^a left arrow | go to left pane. other arrows for right, top, bottom panes |
^a ? | list all key sequences |
^a & , exit or logout | kill current window |
^a x | kill current pane |
then detach and check on our session.
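A session can also be created already detached, without any key sequence, which is handy in scripts. A minimal sketch: the session name `mb-demo` is hypothetical, and `sleep 30` stands in for the real long-running command. The block skips itself if tmux is not available or cannot start.

```shell
session=mb-demo                       # hypothetical session name
if command -v tmux > /dev/null 2>&1 &&
   tmux new-session -d -s "$session" 'sleep 30' 2> /dev/null; then
  tmux list-sessions                  # the new session is listed
  if tmux has-session -t "$session" 2> /dev/null; then
    result=running
  else
    result=missing
  fi
  tmux kill-session -t "$session"     # clean up the demo session
else
  result=skipped
  echo "tmux not available here"
fi
echo "$result"
```

With a real job, the session would be left running instead of killed, and re-attached later with tmux attach -t followed by the session name.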
long jobs on stat.wisc.edu machines
See here, where you can replace screen by tmux.
idea: run a stashticket command to maintain the AFS token before starting a tmux (or screen) session, and run an ssh session inside of it.
- one simple 1-window tmux session in which I ssh.
- If I need multiple windows, then run another tmux session inside.
running many jobs
- high performance: one (or few) very long job(s), possibly requiring a lot of memory
- high throughput: very many jobs, each short (<24h). need a program to distribute the jobs across very many different machines, and to get all output files back from these machines at the end.
resources on campus:
- ACI (advanced computing initiative) and CHTC (center for high-throughput computing) with HTCondor for job scheduling
- HPC cluster “lunchbox” in Statistics: uses slurm for job scheduling
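For the high-throughput case with slurm, many short jobs are often submitted as one job array. The following is a minimal sketch of such a submission script, not the configuration of any particular cluster: the job name, the array size, the time limit and the input file naming scheme are all hypothetical, and site-specific options (such as a partition) are left out. It would be submitted with sbatch followed by the script name.

```shell
#!/bin/bash
#SBATCH --job-name=many-short-jobs     # hypothetical job name
#SBATCH --array=1-10                   # 10 independent tasks
#SBATCH --time=01:00:00                # wall-time limit for each task
#SBATCH --output=task_%A_%a.out        # %A: job ID, %a: array index

# Slurm sets SLURM_ARRAY_TASK_ID for each task; default to 1 when run by hand.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
input="input_${task_id}.txt"           # hypothetical input file naming scheme
echo "task $task_id would process $input on $(hostname)"
```

The #SBATCH lines are ordinary comments for the shell, so the script also runs by hand for testing.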