HPC tutorial: Introduction to DOLFIN/UNICORN
Servers where you can work:
General: you can log in from home and work on them
* descartes.csc.kth.se
* riesz.csc.kth.se
Batch server: use it only for batch computing
* hydra.csc.kth.se
PDC Supercomputers:
* lindgren
To keep your program running after you log out, use the program 'screen'.
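For example, start a named session, run your program inside it, and detach with Ctrl-a d before logging out; later you can list your sessions and reattach (the session name 'run1' is only an illustration):

screen -S run1
screen -ls
screen -r run1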
AFS
When you log in, you are in your home directory, which is managed by AFS. Whichever server you connect to, you end up in the same directory, shared across the network.
First you should make sure you have a valid Kerberos token:
larcher@hydra:~$ klist
Credentials cache: FILE:/tmp/krb5cc_7993_g13719
        Principal: larcher@NADA.KTH.SE

  Issued           Expires          Principal
Jul  3 14:23:55  Jul  4 00:23:55  krbtgt/NADA.KTH.SE@NADA.KTH.SE
Jul  3 14:23:56  Jul  4 00:23:55  afs@NADA.KTH.SE
Jul  3 14:23:56  Jul  4 00:23:55  afs/pdc.kth.se@NADA.KTH.SE
To get a new Kerberos ticket, use:
larcher@hydra:~$ kinit larcher@NADA.KTH.SE
To obtain an AFS token from your Kerberos credentials, or to renew it, run:
larcher@hydra:~$ aklog
(or 'afslog', depending on the system)
NOBACKUP
You should run your computations in a dedicated directory located under /NOBACKUP.
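For example (the directory layout below is only a suggestion; adapt it to the local conventions):

mkdir -p /NOBACKUP/larcher/hpc-tutorial
cd /NOBACKUP/larcher/hpc-tutorial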
Module
To manage the installed software, use the program 'module'.
First, set up the path to the modules:
larcher@descartes:~$ export MODULEPATH=/afs/nada.kth.se/dept/na/ctl/pkg/@sys/modulefiles
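If you do not want to type this every time, you can add the export line to your shell startup file (for example ~/.bashrc).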
You can then check the list of modules:
larcher@descartes:~$ module avail
--------------------------------- /afs/nada.kth.se/dept/na/ctl/pkg/@sys/modulefiles ----------------------------------
dolfin-hpc/0.8.0    ffc/0.5.1        paraview/3.14.1  unicorn-hpc/0.1.0  unicorn-hpc/current
dolfin-hpc/0.8.1    fiat/0.3.4       parmetis/3.1.1   unicorn-hpc/0.1.1  valgrind/3.7.0
dolfin-hpc/0.8.2    instant/0.9.5    petsc/3.0.0      unicorn-hpc/0.1.2
dolfin-hpc/current  paraview/3.12.0  ufc/1.1          unicorn-hpc/0.1.3
To load a module:
larcher@descartes:~$ module add unicorn-hpc
To list the loaded modules:
larcher@descartes:~$ module list
Currently Loaded Modulefiles:
  1) unicorn-hpc/current   2) dolfin-hpc/current
By default the system picks the most recent module version. If several versions are installed, you can choose a specific one with 'module swap'.
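For example, to switch from the default to one of the specific versions shown by 'module avail':

module swap unicorn-hpc unicorn-hpc/0.1.2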
DOLFIN/UNICORN Tutorial
First check out the latest version of the tutorial with Bazaar:
larcher@descartes:~$ bzr branch /afs/nada.kth.se/dept/na/ctl/repo/bzr/hpc-tutorial
Branched 8 revision(s).
The tutorial directory contains the following files:
larcher@descartes:~/hpc-tutorial$ ls -dl *
-rw-r--r-- 1 larcher dip    1505 2012-07-03 13:59 chkp.cpp
-rwxr-xr-x 1 larcher dip     164 2012-07-03 13:59 daisy.csh
-rw-r--r-- 1 larcher dip  141130 2012-07-03 13:59 hpc-tutorial-2012.pdf
-rw-r--r-- 1 larcher dip  158196 2012-07-03 13:59 hpc-tutorial.pdf
-rw-r--r-- 1 larcher dip     491 2012-07-03 13:59 Makefile
-rw-r--r-- 1 larcher dip    3582 2012-07-03 13:59 mesh.cpp
-rw-r--r-- 1 larcher dip    1225 2012-07-03 13:59 minimal.cpp
-rw-r--r-- 1 larcher dip      52 2012-07-03 13:59 parameters
-rw-r--r-- 1 larcher dip      52 2012-07-03 13:59 parameters_restart
-rw-r--r-- 1 larcher dip     182 2012-07-03 13:59 submitfile
-rw-r--r-- 1 larcher dip     179 2012-07-03 13:59 submitfile_chkp
-rw-r--r-- 1 larcher dip     183 2012-07-03 13:59 submitfile_restart
-rw-r--r-- 1 larcher dip 2958009 2012-07-03 13:59 usquare.xml
Make sure that 'dolfin-hpc' and 'unicorn-hpc' are loaded before running the test:
larcher@descartes:~/hpc-tutorial$ module list
Currently Loaded Modulefiles:
  1) unicorn-hpc/current   2) dolfin-hpc/current
Three C++ files are provided with the tutorial as an introduction to DOLFIN:
larcher@descartes:~/hpc-tutorial$ ls -l *.cpp
-rw-r--r-- 1 larcher dip 1505 2012-07-03 13:59 chkp.cpp
-rw-r--r-- 1 larcher dip 3582 2012-07-03 13:59 mesh.cpp
-rw-r--r-- 1 larcher dip 1225 2012-07-03 13:59 minimal.cpp
Example: mesh.cpp
* Create a mesh object by loading the unit square mesh described in the XML file 'usquare.xml':
Mesh mesh("usquare.xml");
* The best way to print from DOLFIN is to use the function dolfin::message.
* To save meshes, functions, vectors and matrices you should use the "File" class:
File f_mesh("mesh.pvd");
f_mesh << mesh;
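Putting these pieces together, a minimal stand-alone program in the spirit of mesh.cpp might look like the sketch below. This is only an illustration: MPI/PETSc initialization and the refinement and load-balancing steps performed by the real mesh.cpp are omitted, and the numVertices() call is an assumption about the 0.8.x mesh interface.

#include <dolfin.h>

using namespace dolfin;

int main(int argc, char* argv[])
{
  // Load the unit square mesh from the XML file; in a parallel run
  // the mesh is distributed over the processes.
  Mesh mesh("usquare.xml");

  // Print from DOLFIN with dolfin::message (printf-style formatting);
  // numVertices() is assumed here, check mesh.cpp for the exact call.
  message("Mesh loaded with %d vertices", mesh.numVertices());

  // Save the mesh in ParaView format through the File class.
  File f_mesh("mesh.pvd");
  f_mesh << mesh;

  return 0;
}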
To build the examples:
larcher@descartes:~/hpc-tutorial$ make
`pkg-config --variable=compiler dolfin` `pkg-config --cflags unicorn` -I./ -I../ mesh.cpp `pkg-config --libs unicorn` -o mesh
`pkg-config --variable=compiler dolfin` `pkg-config --cflags unicorn` -I./ -I../ minimal.cpp `pkg-config --libs unicorn` -o minimal
`pkg-config --variable=compiler dolfin` `pkg-config --cflags unicorn` -I./ -I../ chkp.cpp `pkg-config --libs unicorn` -o chkp
Then you can run an example on two processors:
larcher@descartes:~/hpc-tutorial$ mpirun -np 2 ./mesh
Initializing DOLFIN version 0.8.2-hpc.
Initializing DOLFIN version 0.8.2-hpc.
*** Warning: Reading DOLFIN xml meshes in parallel is depricated. For better I/O performance, consider converting to flat binary
*** Warning: Reading DOLFIN xml meshes in parallel is depricated. For better I/O performance, consider converting to flat binary
Mesh loaded with 16641 vertices
Rank: 0 has 8294 vertices
Rank: 0 has 83 ghosted and 137 shared vertices
Rank: 0 vertex 10 has global number 10
Rank: 1 has 8484 vertices
Rank: 1 has 54 ghosted and 137 shared vertices
Rank: 1 vertex 10 has global number 8221
...
In this case the program loads a mesh, distributes it over the 2 processes, calls the refinement, and rebalances the newly refined mesh across the 2 processes.
Example: minimal.cpp
This file gives an example of a basic solver.
The 'unicorn_init' function needs arguments to run a computation:
larcher@descartes:~/hpc-tutorial$ ./minimal
Initializing DOLFIN version 0.8.2-hpc.
Initializing Unicorn version 0.1.3-hpc.
Usage: -p <parameters> [-m <mesh> -c <checkpoint>] [-i iteration] [-l <wall clock limit>] [-o <petsc arguments>] [-s <structure mesh>]
You need to provide a 'parameters' file and a mesh:
larcher@descartes:~/hpc-tutorial$ ./minimal -p parameters -m usquare.xml
Initializing DOLFIN version 0.8.2-hpc.
Initializing Unicorn version 0.1.3-hpc.
Running on 1 node
Global number of vertices: 16641
Global number of cells: 32768
Running iteration 0 of 5
Pre Solver Post
Running iteration 1 of 5
Pre Solver Post
Running iteration 2 of 5
Pre Solver Post
Running iteration 3 of 5
Pre Solver Post
Running iteration 4 of 5
Pre Solver Post
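Like the mesh example, 'minimal' can also be started in parallel with mpirun, for example on two processes:

mpirun -np 2 ./minimal -p parameters -m usquare.xml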
Queue system on Hydra
Make sure you set the MODULEPATH environment variable again if you log in to hydra.
Show the queue:
larcher@hydra:~$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
6401.hydra                wccm2012         spuhler         110:30:2 R batch
You need a submit file to add a job to the queue, like 'submitfile':
#PBS -N unicorn
#PBS -l walltime=00:10:00,nodes=1:ppn=4
#PBS -m abe
#PBS -v KRB5CCNAME
cd $PBS_O_WORKDIR
afslog
mpirun --hostfile $PBS_NODEFILE minimal -p parameters -m usquare.xml
To add the job:
qsub submitfile
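After submission the job appears in the 'qstat' listing shown above. If you need to remove a job from the queue, use 'qdel' with the job id reported by 'qstat', for example:

qdel 6401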