These are the GPU facilities available to CS students:
- 30 PCs in lab121 fitted with an Nvidia RTX 4060 Ti with 16GB RAM and 4352 cores
- 25 PCs in lab105 fitted with an Nvidia RTX 3090 with 24GB RAM and 10,496 cores
- Blaze, a server fitted with 4 Nvidia GeForce Titan X cards with 12GB RAM and 3072 cores each
The GPUs in the computer labs are ideal for code development and running short compute jobs. Blaze is a dedicated GPU server and is ideally suited to running longer GPU jobs.
The lab PCs boot into Linux by default and can be accessed remotely. However, please note that the PCs reboot regularly on Monday and Thursday evenings between 7.30pm and midnight. These PCs are also used by others and may be rebooted at any time. If you wish to run a long compute job on a lab PC, a good option is to run it overnight (excluding Monday and Thursday evenings) or at the weekend.
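A job started over a remote SSH session will normally be killed when you log out, so detach it from the terminal first. A minimal sketch using nohup (the script name train.py and the log file name are illustrative):

nohup python train.py > train.log 2>&1 &

The job then carries on running after you disconnect, and you can inspect train.log on your next login.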
Accessing GPUs
To access a GPU host you will need to log in via SSH. If you are connected to the CS network then you can SSH directly to the GPU host.
If you are not on the CS network you will need to either:
- use ThinLinc to log in to csrw2.cs.ucl.ac.uk, open a terminal and ssh to the required GPU host
- or, ssh into our SSH gateway knuckles.cs.ucl.ac.uk and then ssh into the GPU host.
You can also do this with one command: ssh -l alice -J alice@knuckles.cs.ucl.ac.uk canada-l.cs.ucl.ac.uk (replace alice with your CS username and canada-l with the name of the CS machine you wish to connect to).
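If you connect from outside regularly, you can store the jump-host details in your SSH configuration so a single short command works every time. A sketch of a ~/.ssh/config entry, assuming the username alice and the machine canada-l from the example above:

Host canada-l
    HostName canada-l.cs.ucl.ac.uk
    User alice
    ProxyJump alice@knuckles.cs.ucl.ac.uk

With this in place, ssh canada-l hops through the gateway automatically.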
lab105:
aylesbury-l barnacle-l breeze-l brent-l bufflehead-l cackling-l
canada-l crested-l eider-l gadwall-l goosander-l gressingham-l
harlequin-l kamzi-l mallard-l mandarin-l pintail-l pocher-l
ruddy-l scaup-l scoter-l shelduck-l shoveler-l smew-l
wigeon-l
lab121:
albacore-l barbel-l chub-l cripps-l dory-l elver-l
flounder-l goldeye-l hake-l inanga-l javelin-l koi-l
lamprey-l mackerel-l mullet-l nase-l opah-l pike-l
plaice-l quillback-l roach-l rudd-l shark-l skate-l
tench-l tope-l uaru-l vimba-l whitebait-l yellowtail-l
zander-l
Checking GPU availability
As Linux is a multiuser system, you will need to check that another user is not already running a job on the GPU. The command nvidia-smi
will display a summary of the GPU cards, as well as the processes that each card is working on. If all the GPU cards are occupied you will have to either run your job on a different system, or wait for a card to become free.
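For a quicker overview than the full nvidia-smi table, you can query just the fields that matter for availability; these query options are standard nvidia-smi flags:

nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total --format=csv

A card showing near-zero utilisation and memory use is normally free to use.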
On systems fitted with more than one GPU card, such as Blaze, it is important to run your processes on one card only. This allows fair and efficient use of the GPUs as a shared resource. To do this, navigate to the directory /usr/local/cuda/ and run the following command:
source CUDA_VISIBILITY.csh
To check which GPU your system is using, and to ensure that the environment variable has been set correctly, type:
env | grep CUDA
You should see the following output:
CUDA_VISIBLE_DEVICES=n
where n is a number between 0 and x-1, x being the number of GPU cards installed in that system.
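The CUDA_VISIBILITY.csh script is written for csh-family shells. If your login shell is bash, or you want to choose a card yourself after checking nvidia-smi, you can set the variable manually; a sketch, where the card index 1 is only an example:

export CUDA_VISIBLE_DEVICES=1

Any CUDA program started in that session will then see only card 1.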
Setting up TensorFlow
To install TensorFlow in your filespace, you first need to add Python to your PATH by running one of the setup scripts in:
/opt/Python
as follows:
source /opt/Python/Python-3.10.1_Setup.csh
Please pick whichever version is appropriate for your work.
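If you are not sure which versions are installed, list the setup scripts first:

ls /opt/Python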
Then, use pip (the Python package manager) to install the TensorFlow or PyTorch package:
pip install tensorflow --user
pip install torch --user
If you don't add the --user flag, pip may attempt to install the package in the global filesystem, which you won't have write permission for.
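Once installed, it is worth checking that the framework can actually see a GPU before starting a long job. A quick check, assuming a TensorFlow 2.x or CUDA-enabled PyTorch build:

python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
python -c "import torch; print(torch.cuda.is_available())"

The first command should list at least one GPU device and the second should print True on a GPU host.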
For any issues with Blaze, please contact us, or come and see us in person.