Transferring Files to DeepThought¶
How you transfer files to the HPC depends on your OS. Thankfully, there are some excellent tools that take this from ‘potentially-scary’ to ‘click a button or two’.
Before we start, ensure that you have read the Storage Overview & Usage Guidelines.
All file transfers are done via the Secure File Transfer Protocol (SFTP) or the Secure Copy Protocol (SCP). Other tools, such as rsync, are also usable. This guide focuses on GUI-based tools that use SFTP.
Before we get started¶
The HPC is a little different than your desktop at home when it comes to storage, not just computing power. It’s a shared resource, so we can’t store everybody’s data for all time - there just isn’t enough space.
On DeepThought, there are two main storage tiers, with a smaller pool for your documents and scripts. Firstly, our bulk storage (approx. 250TB) is the ‘Scratch’ area (located at /scratch/user/$FAN), which sits on slower, spinning Hard-Disk Drives (HDDs). Secondly, the smaller, hyper-fast NVMe Solid-State Drives (located at /local) are roughly 400GB on the ‘standard’ nodes (1-16) and 1.5TB on the ‘high-capacity’ nodes (19-21).
There is a critical difference between these two locations. The /scratch area is a common storage area: you can access it from all of the login, management and compute nodes on the HPC. In contrast, /local exists separately on each compute node. That is, if your job is running on Node001, its /local only exists on that particular node - you cannot access it from anywhere else on the HPC.
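Because /local only exists on the node running your job, data usually has to be copied in from /scratch when a job starts and copied back out before it ends. The following is a minimal sketch of that idea, not DeepThought's official workflow; the helper name stage and the example paths in the comments are hypothetical.

```shell
# stage SRC DEST_DIR
# Copy a file or directory SRC into DEST_DIR, creating DEST_DIR if needed.
stage() {
    mkdir -p "$2" && cp -r "$1" "$2"/
}

# Hypothetical usage inside a job script (all paths are assumptions):
#   stage "/scratch/user/$FAN/input_data" "/local/$FAN/job"    # copy in
#   ... run the job against /local/$FAN/job/input_data ...
#   stage "/local/$FAN/job/results" "/scratch/user/$FAN"       # copy out
```

Copying results back to /scratch before the job ends matters because nothing outside that node can read its /local.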
The following location is treated specially, as it holds the older eRSA data and is slowly being phased out.
This location will surface specific mount points on request. If you have access to these locations, they auto-mount on usage. That is, simply try to touch or access the correct location that you have access to, and it will be handled automatically for you. For example, if you have access to a mount point called ‘molecular_data’, then the following command will surface that mount point for you -
Your ‘home’ directory sits on a small pool of storage (~11TB total) and is meant for your small bits and pieces. It is analogous to the Windows ‘Documents’ folder.
At a command prompt, your home directory usually gets shortened to ~/.
What to store in /home¶
Here is a rough guide as to what should live in your /home/$FAN directory. In general, you want only small things here.
- SLURM Scripts
- Results from Jobs.
- ‘Small’ Data-Sets (<5GB)
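To check whether a data-set fits the ‘small’ guideline above before copying it to /home, you can measure it with du. This is a minimal sketch, assuming the 5GB guideline is read as 5 GiB; the function names are hypothetical.

```shell
# Guideline from the list above, expressed in KiB (assumes 5GB means 5 GiB).
LIMIT_KIB=$((5 * 1024 * 1024))

# size_in_kib PATH: print the disk usage of PATH in KiB.
size_in_kib() {
    du -sk "$1" | awk '{ print $1 }'
}

# fits_in_home PATH: succeed if PATH is under the guideline.
fits_in_home() {
    [ "$(size_in_kib "$1")" -lt "$LIMIT_KIB" ]
}

# Hypothetical usage:
#   fits_in_home my_dataset && echo "OK for /home" || echo "Use /scratch instead"
```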
Scratch is your working space. Depending upon your dataset, you may need to run your job here - this is not optimal and will be much slower than running it from /local. Scratch is still not an area to store your data permanently - there are no backups in place for the HPC, so ensure you follow the HPC Research Data Flow and the HPC Job Data Flow.
What to store in /scratch¶
Here is a rough guide as to what should live in your /scratch/$FAN directory. In general, anything large, bulky and only needed for a little while should go here.
- Job Working Data-sets
- Intermediate files
Linux/Unix File Transfers¶
Linux/Unix-based systems have native support for SFTP. The Secure Copy Protocol (SCP) is also widely supported and can sometimes offer an edge in transfer speed. Tools such as rsync are also usable.
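As an illustration of the rsync option, the sketch below copies a directory to scratch. The FAN and directory name are hypothetical placeholders, and the destination assumes the /scratch/user/$FAN layout described earlier. The command is echoed rather than run so that it can be checked first.

```shell
FAN="abcd0001"          # hypothetical FAN; use your own
SRC="my_dataset"        # hypothetical local directory
DEST="${FAN}@deepthought.flinders.edu.au:/scratch/user/${FAN}/"

# -a recurses and preserves permissions and timestamps,
# -v lists files as they transfer,
# -P shows progress and keeps partial files so a re-run can pick them up.
# Remove "echo" to perform the transfer.
echo rsync -avP "$SRC" "$DEST"
```

Without a trailing slash on the source, rsync creates my_dataset inside the destination directory rather than copying only its contents.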
The Windows Subsystem for Linux¶
Since Windows 10 and Windows Server 2019, the Windows Subsystem for Linux (WSL) allows you to run a Linux distribution as a subsystem in Windows. When following these instructions, a ‘terminal’ is the same as starting your WSL distribution.
Transferring Files to the HPC¶
On a *NIX-based system, the terminal is the fastest way to upload files to the HPC.
The Quick Version¶
Substitute your filename and FAN, type scp FILENAME FAN@deepthought.flinders.edu.au:/home/FAN, then hit enter. Enter your password when prompted. This will put the file in your home directory on DeepThought. When substituted accordingly, it looks similar to:
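With a hypothetical FAN of abcd0001 and a hypothetical file called results.csv, the substituted command can be sketched as follows. It is echoed rather than run so that you can check it first.

```shell
FAN="abcd0001"            # hypothetical FAN; use your own
FILENAME="results.csv"    # hypothetical file in the current directory

# Remove "echo" to perform the upload; scp will then prompt for your password.
echo scp "$FILENAME" "${FAN}@deepthought.flinders.edu.au:/home/${FAN}"
# prints: scp results.csv abcd0001@deepthought.flinders.edu.au:/home/abcd0001
```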
The Longer Version¶
To download files from DeepThought, you simply need to invert that command to point to either:
- The name of a computer that DeepThought ‘knows’ about.
- An IP address that DeepThought can reach.
Transfers By Computer Name¶
If you know the hostname of your computer, you can use it to transfer files back to your machine. The command stays mostly the same; we just change where it points. This example assumes you are transferring to a Linux/Unix-based machine.
The command will take this form:
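A sketch of that form, run from a terminal on DeepThought, is shown below. The user name, hostname and destination directory are hypothetical placeholders, and the destination machine is assumed to accept SSH connections.

```shell
FILENAME="results.csv"                   # hypothetical file on DeepThought
LOCAL_USER="alice"                       # hypothetical account on your own machine
LOCAL_HOST="my-workstation.example.org"  # hypothetical hostname of your machine

# Remove "echo" to perform the transfer.
echo scp "$FILENAME" "${LOCAL_USER}@${LOCAL_HOST}:/home/${LOCAL_USER}"
```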
Transfer By IP Address¶
If you don’t know your computer’s IP address, the following command will help you find it:
- ip addr
Just like above, we change the command slightly and substitute an IP address for the hostname.
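The sketch below pulls the first global IPv4 address out of the ip output and prints the matching scp command. The helper name first_ipv4 is hypothetical, 192.0.2.10 is a documentation placeholder used when no address is found, and the /home destination assumes a Linux machine. Run it on your own computer, then run the printed command on DeepThought with your real filename.

```shell
# first_ipv4: read `ip -4 -o addr show` output on stdin and print the first
# address without its /prefix length.
first_ipv4() {
    awk '{ split($4, parts, "/"); print parts[1]; exit }'
}

MY_IP="192.0.2.10"    # placeholder, replaced below if an address is found
if command -v ip >/dev/null 2>&1; then
    found=$(ip -4 -o addr show scope global | first_ipv4)
    if [ -n "$found" ]; then
        MY_IP="$found"
    fi
fi

LOCAL_USER=$(id -un)  # your user name on this machine
echo "scp FILENAME ${LOCAL_USER}@${MY_IP}:/home/${LOCAL_USER}"
```

An address found this way is only useful if DeepThought can actually reach it; a private address behind a home router, for example, may not be reachable.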
Windows File Transfers¶
Windows doesn’t natively support SFTP. Thankfully, there are lots of clients written to do just this for us.
Potential Client List¶
This is not an exhaustive list - feel free to use whatever you wish that supports the SFTP protocol.
This guide will focus on WinSCP.
Getting Connected with WinSCP¶
Open WinSCP, enter deepthought.flinders.edu.au as the host to connect to, and click Login. You should have a screen that looks like this.
The first time you connect you will get a warning - this is fine, just click YES to continue.
A connection to DeepThought will then be created. If all goes well, you will be treated to this screen:
You can now drag and drop files between your computer (Left-hand side) and DeepThought (Right-hand side).