HOW-TO Grid Storage and Processing

Astro-WISE connected to the Grid

Grid processing is enabled by default for all Astro-WISE implementations. Grid storage is enabled for the following Astro-WISE implementations:

  • Lofar (or LoWISE)

To make use of the grid services you need a valid grid certificate. To obtain a (Dutch) certificate, see the TCS eScience Portal. The first step is to create the certificate, which will be stored in your browser. You then need to export the certificate to your home directory; see the help page section “How to export certificates from the keystore” on the TCS eScience Portal.

If your organisation is not supported by the TCS eScience Portal, see dutchgrid for information on how to obtain a certificate.

Your grid certificate is strictly personal and should never be given to anyone else. Please note that your certificate is valid for one year. Renew it before it expires; otherwise you must request a new one.

To store or retrieve data on the grid, or to do processing, you need a proxy certificate based on your grid certificate. To generate a proxy certificate and upload it to the myproxy server, you can use either the DPU from the AWE prompt or the Java web tool. On the AWE prompt, do:

awe> from common.net.dpu_client import dpu_IO
awe> dpu = dpu_IO(dpu_name='dpu.grid.target.astro-wise.org')
awe> dpu.myproxyinit()

Your proxy certificate is now valid for 7 days. After 7 days you should regenerate the proxy certificate.
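
Because the proxy expires after 7 days, it can help to keep track of when it was created. The following is a minimal sketch in plain Python and is not part of Astro-WISE: the helper name proxy_needs_renewal and the one-day safety margin are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Proxy lifetime as stated in this HOW-TO.
PROXY_LIFETIME = timedelta(days=7)

def proxy_needs_renewal(created, now=None, margin=timedelta(days=1)):
    """Return True if a proxy created at `created` expires within `margin`.

    Hypothetical helper; Astro-WISE does not provide this function.
    """
    if now is None:
        now = datetime.now()
    expires = created + PROXY_LIFETIME
    return now >= expires - margin

created = datetime(2024, 1, 1, 12, 0)
# Two days after creation: still well within the 7-day lifetime.
print(proxy_needs_renewal(created, now=datetime(2024, 1, 3, 12, 0)))
# Less than one day before expiry: time to run dpu.myproxyinit() again.
print(proxy_needs_renewal(created, now=datetime(2024, 1, 7, 18, 0)))
```

When the helper returns True, the proxy can be regenerated with dpu.myproxyinit() as shown above.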

An overview of scheduled and unscheduled downtime of grid servers and services can be found on the Nikhef web site. In case of errors, please check there first.

The following Astro-WISE environment variables related to grid storage can be set:

  • grid_certificate_pass
  • storage_element
  • storage_protocol
  • virtual_organisation
  • virtual_organisation_group
  • virtual_organisation_role

Only ‘srm’ is supported as grid storage protocol, so set storage_protocol to ‘srm’. You must be a member of a virtual organisation (VO) to work on the grid. The virtual_organisation environment variable defines the VO you will be working in; examples are lofar and omegac. The optional virtual_organisation_group and virtual_organisation_role variables set a group or role within the VO. Your grid certificate is protected by a pass phrase, which you can set in the grid_certificate_pass environment variable. If grid_certificate_pass is left empty, you will be prompted for the pass phrase the first time it is needed. When you store files on the grid, the storage_element variable specifies which storage element to use, for example srm.grid.sara.nl.
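
The rules above can be summarised as a small consistency check. This is only a sketch in plain Python: the variable names come from the list above, but the function check_grid_settings is hypothetical, the dictionary merely stands in for wherever your Astro-WISE environment is configured, and treating virtual_organisation as mandatory is an assumption based on the requirement to work within a VO.

```python
# Variable names are taken from the list above; the checker is hypothetical.
REQUIRED = ('storage_protocol', 'virtual_organisation')
OPTIONAL = ('grid_certificate_pass', 'storage_element',
            'virtual_organisation_group', 'virtual_organisation_role')

def check_grid_settings(settings):
    """Return a list of problems found in a dict of grid settings."""
    problems = []
    for name in settings:
        if name not in REQUIRED + OPTIONAL:
            problems.append('unknown variable: %s' % name)
    # Only the srm protocol is supported for grid storage.
    if settings.get('storage_protocol') != 'srm':
        problems.append("storage_protocol must be 'srm'")
    # Assumption: a VO must always be named explicitly.
    if not settings.get('virtual_organisation'):
        problems.append('virtual_organisation is not set')
    return problems

settings = {
    'storage_protocol': 'srm',
    'virtual_organisation': 'lofar',
    'storage_element': 'srm.grid.sara.nl',
    'grid_certificate_pass': '',  # empty: prompt for the pass phrase on first use
}
print(check_grid_settings(settings))  # an empty list means no problems were found
```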

Store and Retrieve Example

This example shows how to store and retrieve files from the grid. It assumes that the LoWISE system is used.

awe> from common.database.DataObject import DataObject
# create an object
awe> my_object = DataObject()
# point to an existing file with the filename attribute
awe> my_object.filename = 'an_existing_file'
# store the object
awe> my_object.store()
# and commit to the database
awe> my_object.commit()

To retrieve the file associated with the DataObject, first delete the local file and then call retrieve():

awe> my_object.retrieve()

Processing examples

To process or run jobs on the grid, a special DPU must be selected, namely the grid DPU. See HOW-TO Process data in a distributed (parallel) way for more information about the DPU. Once the grid DPU has been selected, jobs can be dispatched to the grid through it. Below is an example of initializing the grid DPU and submitting a job.

# Initialize the grid DPU
awe> from common.net.dpu_client import dpu_IO
awe> dpu = dpu_IO(dpu_name='dpu.grid.target.astro-wise.org')
# Initialize a test job, replace with your own job
awe> from common.net.myJob import myJob
awe> job = myJob()
awe> job.set_parameters(10) # sleep for 10 seconds
awe> dpu_dict = {'DPU_JOBS' : [job]}
# And submit job to the DPU
awe> key = dpu.getkey()
awe> dpu.submitjobs(key, jobs=[dpu_dict])
# get the status of the job
awe> dpu.getstatus(key)
# The DPU status can also be requested using a browser on
# http://dpu.test.lofar.egee.rug.astro-wise.org:19001/status_reverse
# Wait until the job is finished ....
# and retrieve the job and its log
awe> result = dpu.retrievejobs(key)

The status of the DPU and the submitted jobs can also be viewed in a browser, at the status URL given in the example above.
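
The "wait until the job is finished" step can be automated with a simple polling loop. The sketch below is plain Python and not part of Astro-WISE: wait_until is a hypothetical helper, and because the return format of dpu.getstatus() is not described here, the completion test is passed in as a function that you would have to write yourself.

```python
import time

def wait_until(is_done, interval=30.0, timeout=3600.0, sleep=time.sleep):
    """Call is_done() every `interval` seconds until it returns True.

    Returns True if is_done() reported completion, False on timeout.
    Hypothetical helper; not provided by Astro-WISE.
    """
    waited = 0.0
    while waited <= timeout:
        if is_done():
            return True
        sleep(interval)
        waited += interval
    return False

# Stand-in for a real status check: reports completion on the third call.
calls = []
def fake_is_done():
    calls.append(1)
    return len(calls) >= 3

print(wait_until(fake_is_done, interval=0.01, timeout=1.0))
```

In an awe session, is_done would wrap a call to dpu.getstatus(key) and interpret its output; once wait_until returns True, the results can be fetched with dpu.retrievejobs(key).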

The environment (Env) can be used to configure the interface with the grid. See the section “Astro-WISE connected to the Grid” above for an overview of the environment variables that apply to the grid.

Grid tools

To interact with grid resources, three tools have been incorporated into Astro-WISE. JLite handles proxy certificates, the dcache-srm-client provides an interface to grid storage, and the myproxy tool handles my-proxies. A short description of each tool follows; for more detail, see the corresponding Python modules.

JLite

The JLite Python methods are defined in the module common.external.jlite. The proxy_init() method creates a proxy certificate and the proxy_destroy() method destroys a proxy. The proxy_info() method prints the proxy information, and the proxy_info_parser() method returns the proxy information as a dictionary.

dcache-srm-client

The dcache-srm-client Python methods are defined in the module common.external.jlite. The method srmcp() copies files to and from the grid, srmls() prints a directory listing, srmmkdir() creates a directory, and srmrm() removes files and directories from grid storage.

my-proxy

The my-proxy Python methods are defined in the module common.external.myproxy. The method myproxy_upload() uploads my-proxies to a my-proxy server, myproxy_retrieve() retrieves my-proxies, and myproxy_info() shows information about a my-proxy.