#acl All:read

= Scheduling GPU resources in the Grid =

{i} Full documentation at [[Documentation/Tools/Grid/GPU/GPUsOnGrid|this page]].

== Installation, applied configuration and tweaks ==


=== NVIDIA worker ===

 1. Blacklist `nouveau`.
    To avoid errors such as '''ERROR: Unable to load the kernel module 'nvidia.ko'...''' when installing the NVIDIA driver, it is often not enough to include `blacklist nouveau` in `/etc/modprobe.d/blacklist.conf`. The module must also be removed from the initrd image, like so:
    {{{
    # echo -e "blacklist nouveau\noptions nouveau modeset=0" > /etc/modprobe.d/disable-nouveau.conf
    # mkinitrd -f -v /boot/initrd-$(uname -r).img $(uname -r)  # or `dracut -f`
    }}}
    and reboot.

 1. Install `./NVIDIA-Linux-x86_64-325.08.run` (or whichever driver version applies).

 1. Install `./cudatoolkit_4.0.17_linux_64_rhel6.0.run` (or whichever toolkit version applies).

 1. Check the output of the `nvidia-smi` command.
    If it reports `Not Supported` or N/A information, for example:
    {{{
    (..)
    +-----------------------------------------------------------------------------+
    | Compute processes:                                               GPU Memory |
    |  GPU       PID  Process name                                     Usage      |
    |=============================================================================|
    |    0            Not Supported                                               |
    |    1            Not Supported                                               |
    |    2            Not Supported                                               |
    |    3            Not Supported                                               |
    +-----------------------------------------------------------------------------+
    }}}
    the `libnvidia-ml.so.1` library needs to be patched:
     1. Get the patch from the [[https://github.com/CFSworks/nvml_fix|nvml_fix]] repository on GitHub.
     1. Compile it with `TARGET=<your-nvidia-driver-version>` (the version must be supported by the fix).
         * HACK: on Scientific Linux 6 it must be linked against the `pthread` and `dl` libraries:
           {{{
           # cat Makefile
           (..)
           CFLAGS        = -lpthread -ldl
           (..)
           }}}
     1. Remove the link `/usr/lib64/libnvidia-ml.so.1` and replace it with the newly built `$PWD/libnvidia-ml.so.1` file.
         * Note that the target is '''lib64''', not `lib`, which is the Makefile's default libdir.
         * Do not use `make install PREFIX=/usr`; copy the file by hand.
         * Do not create a link, since `ldconfig` will overwrite it.
    Now the `nvidia-smi` output should look like:
    {{{
    (..)
    +-----------------------------------------------------------------------------+
    | Compute processes:                                               GPU Memory |
    |  GPU       PID  Process name                                     Usage      |
    |=============================================================================|
    |  No running compute processes found                                         |
    +-----------------------------------------------------------------------------+
    }}}
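
The `nvidia-smi` check above can be scripted. The following is a minimal sketch, not part of the original setup: `needs_nvml_patch` is a hypothetical helper that reads `nvidia-smi` output on standard input and looks for `Not Supported` rows after the `Compute processes` header, assuming the table layout shown above.

```shell
#!/bin/sh
# Hypothetical helper (assumption, not part of the original setup):
# reads `nvidia-smi` output on stdin and reports whether the
# "Compute processes" table contains "Not Supported" rows.
needs_nvml_patch() {
    # Only look at the text from the "Compute processes" header onwards
    if sed -n '/Compute processes/,$p' | grep -q "Not Supported"; then
        echo "patch needed"
    else
        echo "ok"
    fi
}

# Usage on a worker node:
#   nvidia-smi | needs_nvml_patch
```

If the layout of the process table differs for other driver versions, the header pattern would need adjusting.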


=== CREAM CE ===

 1. Add to the BLAHP script `/usr/libexec/sge_local_submit_attributes.sh`:

    {{{
    (..)
    if [ -n "$gpu" ]; then
        echo "#$ -l gpu=${gpu}"
    fi
    (..)
    }}}
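
Quoting matters in this test: with an unquoted empty variable, `[ -n ]` succeeds and an empty `gpu=` request would be written. The sketch below wraps the same logic in a function so that the behaviour can be checked in isolation; `emit_gpu_request` is a hypothetical name used only for illustration, and how `gpu` gets set before the script runs is not covered here.

```shell
#!/bin/sh
# Illustrative only: the same test as in the BLAHP fragment, wrapped in a
# function. `emit_gpu_request` is a hypothetical name.
emit_gpu_request() {
    gpu=$1
    # Emit the SGE resource request only when a GPU count was given
    if [ -n "$gpu" ]; then
        echo "#$ -l gpu=${gpu}"
    fi
}

# Usage:
#   emit_gpu_request 2     prints the directive for two GPUs
#   emit_gpu_request ""    prints nothing
```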


=== Scheduler ===

 1. [qmaster] Define complex value 'gpu':

    {{{
    #name               shortcut     type        relop requestable consumable default  urgency  
    #-------------------------------------------------------------------------------------------
    (..)
    gpu                 gpu          INT         <=    YES         YES        0        0
    (..)
    }}}
 
 1. [qmaster] Set the complex values of the GPU host(s):
    {{{
    hostname              tesla.ifca.es
    load_scaling          NONE
    complex_values        gpu=4,mem_free=24G,virtual_free=24G
    user_lists            NONE
    xuser_lists           NONE
    projects              NONE
    xprojects             NONE
    usage_scaling         NONE
    report_variables      NONE
    }}}

 1. Load sensor:
    {{{
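#!/bin/sh
# SGE load sensor: reports the number of free GPUs on this host.

hostname=`uname -n`

while true; do
  # execd requests a report by writing a line to stdin; "quit" ends the sensor
  read input
  result=$?
  if [ $result != 0 ]; then
    exit 1
  fi
  if [ "$input" = "quit" ]; then
    exit 0
  fi

  smitool=`which nvidia-smi`
  result=$?
  if [ $result != 0 ]; then
    # nvidia-smi not found: report no GPUs available
    gpusavail=0
  else
    gpustotal=`nvidia-smi -L | wc -l`
    # Count the rows of the process table (assumes one compute process per GPU)
    gpusused=`nvidia-smi | grep "Process name" -A 6 | grep -v +- | grep -v \|= | grep -v Usage | grep -v "No running" | wc -l`
    gpusavail=`echo $gpustotal-$gpusused | bc`
  fi

  echo begin
  echo "$hostname:gpu:$gpusavail"
  echo end
done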
    }}}
     

 1. [qmaster] Per-host load sensor:
    {{{
    # qconf -sconf tesla
    #tesla.ifca.es:
    load_sensor                  /nfs4/opt/gridengine/util/resources/loadsensors/gpu.sh
    }}}

    * The script must be available on the execution node (e.g. shared via NFS).

 1. [execd] Restart the execd process to load the new sensor, then check that the sensor runs as its child:
    {{{
# ps auxf
(..)
root     24786  0.0  0.0 163252  2268 ?        Sl   16:51   0:00 /nfs4/opt/gridengine/bin/lx-amd64/sge_execd
root     24798  0.0  0.0 106104  1260 ?        S    16:51   0:00  \_ /bin/sh /nfs4/opt/gridengine/util/resources/loadsensors/gpu.sh
root     24801  0.0  0.0 106104   544 ?        S    16:51   0:00      \_ /bin/sh /nfs4/opt/gridengine/util/resources/loadsensors/gpu.sh
root     24802 71.0  0.0  11140   988 ?        R    16:51   0:00          \_ nvidia-smi -L
root     24803  0.0  0.0 100924   632 ?        S    16:51   0:00          \_ wc -l
(..)
    }}}

    * Soft-stop the service if there are jobs running.

 1. [qmaster] Query the `gpu` resource of the GPU host:
   {{{
   # qhost -h tesla -F gpu
   HOSTNAME                ARCH         NCPU NSOC NCOR NTHR  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
   ----------------------------------------------------------------------------------------------
   global                  -               -    -    -    -     -       -       -       -       -
   tesla                   lx-amd64        4    1    4    4  0.19   23.5G    1.7G   11.8G     0.0
       Host Resource(s):      hl:gpu=4.000000
   }}}
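
The load sensor counts used GPUs with a chain of `grep` calls limited to the six lines after the `Process name` header. As a hedged alternative, the sketch below counts every row of the process table with `awk`. `gpus_in_use` is a hypothetical helper, it assumes the `nvidia-smi` table layout shown earlier on this page, and like the original it assumes one compute process per GPU.

```shell
#!/bin/sh
# Hypothetical helper (assumption): reads `nvidia-smi` output on stdin and
# prints the number of rows in the "Compute processes" table.
gpus_in_use() {
    awk '
        /Process name/ { in_table = 1; next }   # header row of the process table
        !in_table      { next }                 # ignore everything before it
        /^[ ]*\+-/     { in_table = 0; next }   # closing border ends the table
        /\|=/          { next }                 # separator under the header
        /No running/   { next }                 # placeholder row, no processes
        { n++ }                                 # one row per compute process
        END { print n + 0 }
    '
}

# Usage on a worker node:
#   nvidia-smi | gpus_in_use
```

Once the `gpu` resource is reported, a job can request it with the standard resource syntax, for example `qsub -l gpu=1 job.sh`, where `job.sh` is a placeholder script name.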


== References ==
 1. GridEngine
    * http://serverfault.com/questions/322073/howto-set-up-sge-for-cuda-devices
    * http://gridengine.org/pipermail/users/2012-April/003338.html
 1. NVIDIA CUDA
    * https://github.com/CFSworks/nvml_fix