File Transfers
Here are some tips to help you get your data transferred to and from the cluster’s file system.
Important
When considering your source or destination in the text below, remember that gruffalo
- because it’s accessible from anywhere (see Getting Connected) - can be both, but your local client may be firewalled and therefore only able to act as a source.
Command line options
There are several tools you could use, but two of the most common are scp
(secure copy) which acts very similar to the normal cp
(copy) command, and rsync
which synchronizes files and folders between a source and a destination. In either case, data is transferred over an SSH connection so you should factor in SSH’s encryption overheads when comparing performance.
Both of these tools have many options to tweak how they run (see the relevant man pages), but in general the basic syntax for either is:
$ <command> /path/to/source user@host:/path/to/destination
scp
The scp
command creates a copy of a file (or a directory and its contents if using the -r
option), copying from the source to the destination. For example, to transfer files to gruffalo
from your local client:
$ scp -r /path/to/source <username>@gruffalo.cropdiversity.ac.uk:/path/to/destination
Note
Pay attention to the :
after the hostname. Miss it out and you’ll end up with a local copy of your file named after your destination, rather than copying it to your destination.
The /path/to/destination
is optional. If you don’t include it, copied files will end up in your home folder on the destination. If you don’t provide the full path to the source, then scp
should be executed in the directory containing the source file/directory.
rsync
rsync
also copies files over a network connection (employing a special delta transfer algorithm to make things a bit faster) but supports resuming an interrupted transfer with its -P
flag. It’s also great for keeping two folders synchronized because it only copies files not already on the destination, or that it detects are different between source and destination.
A common way of synchronizing folders is:
$ rsync -avP --delete /path/to/source <username>@gruffalo.cropdiversity.ac.uk:/path/to/destination
-avP
enables archive mode, verbose output, and resumable transfers, whereas --delete
removes any files on the destination path that are no longer in the source, so use with caution!
Note
Pay attention to whether you have a trailing slash (/
) on the source or not. No slash means you want to copy the directory and its contents, whereas including a slash means you only want to copy the contents.
Here is a short video demonstration of using rsync to import a folder of data from another Linux server external to Crop Diversity.
Graphical options
How to use the various graphical file transfer programs that are available is beyond the scope of this help, but you’ll have many options to pick from regardless of which platform you use. Most of them utilise the same underlying SSH mechanism as the scp
and rsync
tools described above.
Important
Remember that any graphical tool you use needs to access your private key if connecting to gruffalo
away from a Supported Organisations network.
Here are a few recommended clients to get you started:
SSHFS - all platforms, although it can be a little tricky to set up, especially on Windows
FileZilla - all platforms
CyberDuck - macOS or Windows, suggest setting double click to open in editor
MobaXterm - Windows only, but enables a graphical file browser in addition to normal SSH functionality
WinSCP - Windows only
Note
JHI users can also enable Samba Access for easy graphical file browsing.