HPC tutorial: Introduction to DOLFIN/UNICORN
Servers where you can work:
General: you can log in from home and work on them
* descartes.csc.kth.se
* riesz.csc.kth.se
Batch server: use it only for batch computing
* hydra.csc.kth.se
PDC Supercomputers:
* lindgren
To keep your program running after you log out, use the program 'screen'.
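For example, start a named session, run your program inside it, and detach with Ctrl-a d before logging out; later you can list your sessions and reattach (the session name 'run1' is only an illustration):

screen -S run1
screen -ls
screen -r run1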
AFS
When you log in, you are in your home directory, which is managed by AFS. Whichever server you connect to, you end up in the same directory, shared across the network.
First you should make sure you have a valid Kerberos token:
larcher@hydra:~$ klist
Credentials cache: FILE:/tmp/krb5cc_7993_g13719
        Principal: larcher@NADA.KTH.SE

  Issued           Expires          Principal
Jul  3 14:23:55  Jul  4 00:23:55  krbtgt/NADA.KTH.SE@NADA.KTH.SE
Jul  3 14:23:56  Jul  4 00:23:55  afs@NADA.KTH.SE
Jul  3 14:23:56  Jul  4 00:23:55  afs/pdc.kth.se@NADA.KTH.SE
To get a new Kerberos ticket, use:
larcher@hydra:~$ kinit larcher@NADA.KTH.SE
To obtain an AFS token from your Kerberos credentials, or to renew it, run:
larcher@hydra:~$ aklog
(or 'afslog', depending on the system)
NOBACKUP
You should run your computations in a dedicated directory located under /NOBACKUP.
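For example (the directory layout below is only a suggestion; adapt it to the local conventions):

mkdir -p /NOBACKUP/larcher/hpc-tutorial
cd /NOBACKUP/larcher/hpc-tutorial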
Module
To manage the installed software, use the program 'module'.
First, set up the path to the modules:
larcher@descartes:~$ export MODULEPATH=/afs/nada.kth.se/dept/na/ctl/pkg/@sys/modulefiles
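If you do not want to type this every time, you can add the export line to your shell startup file (for example ~/.bashrc).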
You can then check the list of modules:
larcher@descartes:~$ module avail
--------------------------------- /afs/nada.kth.se/dept/na/ctl/pkg/@sys/modulefiles ----------------------------------
dolfin-hpc/0.8.0    ffc/0.5.1        paraview/3.14.1  unicorn-hpc/0.1.0  unicorn-hpc/current
dolfin-hpc/0.8.1    fiat/0.3.4       parmetis/3.1.1   unicorn-hpc/0.1.1  valgrind/3.7.0
dolfin-hpc/0.8.2    instant/0.9.5    petsc/3.0.0      unicorn-hpc/0.1.2
dolfin-hpc/current  paraview/3.12.0  ufc/1.1          unicorn-hpc/0.1.3
To load a module:
larcher@descartes:~$ module add unicorn-hpc
To list the loaded modules:
larcher@descartes:~$ module list
Currently Loaded Modulefiles:
  1) unicorn-hpc/current   2) dolfin-hpc/current
By default the system picks the most recent module version. If several versions are installed, you can choose a specific one with 'module swap'.
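For example, to switch from the default to one of the specific versions shown by 'module avail':

module swap unicorn-hpc unicorn-hpc/0.1.2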
DOLFIN/UNICORN Tutorial
First check out the latest version of the tutorial with Bazaar:
larcher@descartes:~$ bzr branch /afs/nada.kth.se/dept/na/ctl/repo/bzr/hpc-tutorial
Branched 8 revision(s).
The tutorial directory contains the following files:
larcher@descartes:~/hpc-tutorial$ ls -dl *
-rw-r--r-- 1 larcher dip    1505 2012-07-03 13:59 chkp.cpp
-rwxr-xr-x 1 larcher dip     164 2012-07-03 13:59 daisy.csh
-rw-r--r-- 1 larcher dip  141130 2012-07-03 13:59 hpc-tutorial-2012.pdf
-rw-r--r-- 1 larcher dip  158196 2012-07-03 13:59 hpc-tutorial.pdf
-rw-r--r-- 1 larcher dip     491 2012-07-03 13:59 Makefile
-rw-r--r-- 1 larcher dip    3582 2012-07-03 13:59 mesh.cpp
-rw-r--r-- 1 larcher dip    1225 2012-07-03 13:59 minimal.cpp
-rw-r--r-- 1 larcher dip      52 2012-07-03 13:59 parameters
-rw-r--r-- 1 larcher dip      52 2012-07-03 13:59 parameters_restart
-rw-r--r-- 1 larcher dip     182 2012-07-03 13:59 submitfile
-rw-r--r-- 1 larcher dip     179 2012-07-03 13:59 submitfile_chkp
-rw-r--r-- 1 larcher dip     183 2012-07-03 13:59 submitfile_restart
-rw-r--r-- 1 larcher dip 2958009 2012-07-03 13:59 usquare.xml
Make sure that 'dolfin-hpc' and 'unicorn-hpc' are loaded before running the test:
larcher@descartes:~/hpc-tutorial$ module list
Currently Loaded Modulefiles:
  1) unicorn-hpc/current   2) dolfin-hpc/current
Three C++ files are provided with the tutorial as an introduction to DOLFIN:
larcher@descartes:~/hpc-tutorial$ ls -l *.cpp
-rw-r--r-- 1 larcher dip 1505 2012-07-03 13:59 chkp.cpp
-rw-r--r-- 1 larcher dip 3582 2012-07-03 13:59 mesh.cpp
-rw-r--r-- 1 larcher dip 1225 2012-07-03 13:59 minimal.cpp
Example: mesh.cpp
* Create a mesh object by loading the unit square mesh described in the XML file 'usquare.xml':
Mesh mesh("usquare.xml");
* The best way to print from DOLFIN is to use the function dolfin::message.
* To save meshes, functions, vectors and matrices you should use the "File" class:
File f_mesh("mesh.pvd");
f_mesh << mesh;
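Putting these pieces together, a minimal stand-alone program in the spirit of mesh.cpp might look like the sketch below. This is only an illustration: MPI/PETSc initialization and the refinement and load-balancing steps performed by the real mesh.cpp are omitted, and the numVertices() call is an assumption about the 0.8.x mesh interface.

#include <dolfin.h>

using namespace dolfin;

int main(int argc, char* argv[])
{
  // Load the unit square mesh from the XML file; in a parallel run
  // the mesh is distributed over the processes.
  Mesh mesh("usquare.xml");

  // Print from DOLFIN with dolfin::message (printf-style formatting);
  // numVertices() is assumed here, check mesh.cpp for the exact call.
  message("Mesh loaded with %d vertices", mesh.numVertices());

  // Save the mesh in ParaView format through the File class.
  File f_mesh("mesh.pvd");
  f_mesh << mesh;

  return 0;
}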
To build the examples:
larcher@descartes:~/hpc-tutorial$ make
`pkg-config --variable=compiler dolfin` `pkg-config --cflags unicorn` -I./ -I../ mesh.cpp `pkg-config --libs unicorn` -o mesh
`pkg-config --variable=compiler dolfin` `pkg-config --cflags unicorn` -I./ -I../ minimal.cpp `pkg-config --libs unicorn` -o minimal
`pkg-config --variable=compiler dolfin` `pkg-config --cflags unicorn` -I./ -I../ chkp.cpp `pkg-config --libs unicorn` -o chkp
Then you can run an example on two processors:
larcher@descartes:~/hpc-tutorial$ mpirun -np 2 ./mesh
Initializing DOLFIN version 0.8.2-hpc.
Initializing DOLFIN version 0.8.2-hpc.
*** Warning: Reading DOLFIN xml meshes in parallel is depricated. For better I/O performance, consider converting to flat binary
*** Warning: Reading DOLFIN xml meshes in parallel is depricated. For better I/O performance, consider converting to flat binary
Mesh loaded with 16641 vertices
Rank: 0 has 8294 vertices
Rank: 0 has 83 ghosted and 137 shared vertices
Rank: 0 vertex 10 has global number 10
Rank: 1 has 8484 vertices
Rank: 1 has 54 ghosted and 137 shared vertices
Rank: 1 vertex 10 has global number 8221
...
In this case the program loads a mesh, distributes it over the 2 processes, calls the refinement, and rebalances the newly refined mesh across the 2 processes.
Example: minimal.cpp
This file gives an example of a basic solver.
The 'unicorn_init' function needs arguments to run a computation:
larcher@descartes:~/hpc-tutorial$ ./minimal
Initializing DOLFIN version 0.8.2-hpc.
Initializing Unicorn version 0.1.3-hpc.
Usage: -p <parameters> [-m <mesh> -c <checkpoint>] [-i iteration] [-l <wall clock limit>] [-o <petsc arguments>] [-s <structure mesh>]
You need to provide a 'parameters' file and a mesh:
larcher@descartes:~/hpc-tutorial$ ./minimal -p parameters -m usquare.xml
Initializing DOLFIN version 0.8.2-hpc.
Initializing Unicorn version 0.1.3-hpc.
Running on 1 node
Global number of vertices: 16641
Global number of cells: 32768
Running iteration 0 of 5
Pre Solver Post
Running iteration 1 of 5
Pre Solver Post
Running iteration 2 of 5
Pre Solver Post
Running iteration 3 of 5
Pre Solver Post
Running iteration 4 of 5
Pre Solver Post
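Like the mesh example, 'minimal' can also be started in parallel with mpirun, for example on two processes:

mpirun -np 2 ./minimal -p parameters -m usquare.xml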
Queue system on Hydra
Make sure you set the MODULEPATH environment variable again if you log in to hydra.
Show the queue:
larcher@hydra:~$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
6401.hydra                wccm2012         spuhler         110:30:2 R batch
You need a submit file to add a job to the queue, like 'submitfile':
#PBS -N unicorn
#PBS -l walltime=00:10:00,nodes=1:ppn=4
#PBS -m abe
#PBS -v KRB5CCNAME
cd $PBS_O_WORKDIR
afslog
mpirun --hostfile $PBS_NODEFILE minimal -p parameters -m usquare.xml
To add the job:
qsub submitfile
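After submission the job appears in the 'qstat' listing shown above. If you need to remove a job from the queue, use 'qdel' with the job id reported by 'qstat', for example:

qdel 6401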